RE: Spark 3.4.1 and Hive 3.1.3
Hi Yasukazu,

I tried replacing the JAR: the Spark code did not work, but the vulnerability was removed. I agree that even 3.1.3 has other vulnerabilities listed on the Maven page, but those are Medium-level vulnerabilities. We are currently targeting Critical and High vulnerabilities only.

Thanks,
Sanket

From: Nagatomi Yasukazu
Sent: Friday, September 8, 2023 9:35 AM
To: Agrawal, Sanket
Cc: Chao Sun; Yeachan Park; user@spark.apache.org
Subject: [EXT] Re: Spark 3.4.1 and Hive 3.1.3

Hi Sanket,

While migrating to Hive 3.1.3 may resolve many issues, the link below suggests that there might still be some vulnerabilities present. Do you think the specific vulnerability you're concerned about can be addressed with Hive 3.1.3?

https://mvnrepository.com/artifact/org.apache.hive/hive-exec/3.1.3

Regards,
Yasukazu
RE: Spark 3.4.1 and Hive 3.1.3
Hi,

I tried replacing just this JAR, but I am getting errors.

From: Nagatomi Yasukazu
Sent: Friday, September 8, 2023 9:35 AM
To: Agrawal, Sanket
Cc: Chao Sun; Yeachan Park; user@spark.apache.org
Subject: [EXT] Re: Spark 3.4.1 and Hive 3.1.3

Hi Sanket,

While migrating to Hive 3.1.3 may resolve many issues, the link below suggests that there might still be some vulnerabilities present. Do you think the specific vulnerability you're concerned about can be addressed with Hive 3.1.3?

https://mvnrepository.com/artifact/org.apache.hive/hive-exec/3.1.3

Regards,
Yasukazu
Re: Spark 3.4.1 and Hive 3.1.3
Hi Sanket,

While migrating to Hive 3.1.3 may resolve many issues, the link below suggests that there might still be some vulnerabilities present. Do you think the specific vulnerability you're concerned about can be addressed with Hive 3.1.3?

https://mvnrepository.com/artifact/org.apache.hive/hive-exec/3.1.3

Regards,
Yasukazu

On Fri, Sep 8, 2023 at 12:36, Agrawal, Sanket wrote:

> Hi Chao,
>
> The reason to migrate to Hive 3.1.3 is to remove a vulnerability from
> hive-exec-2.3.9.jar.
>
> Thanks
> Sanket
RE: Spark 3.4.1 and Hive 3.1.3
Hi Chao,

The reason to migrate to Hive 3.1.3 is to remove a vulnerability from hive-exec-2.3.9.jar.

Thanks,
Sanket

From: Chao Sun
Sent: Thursday, September 7, 2023 10:23 PM
To: Agrawal, Sanket
Cc: Yeachan Park; user@spark.apache.org
Subject: [EXT] Re: Spark 3.4.1 and Hive 3.1.3

Hi Sanket,

Spark 3.4.1 currently only works with Hive 2.3.9, and it would require a lot of work to upgrade the Hive version to 3.x and up.

Normally though, you only need the Hive client in Spark to talk to HiveMetastore (HMS) for things like table or partition metadata information. In this case, Hive 2.3.9 used by Spark is already capable of communicating with HMS of other versions like Hive 3.x. So, could you share a bit of context on why you want to use Hive 3.1.3 with Spark?

Chao
Re: Spark 3.4.1 and Hive 3.1.3
Hi Sanket,

Spark 3.4.1 currently only works with Hive 2.3.9, and it would require a lot of work to upgrade the Hive version to 3.x and up.

Normally though, you only need the Hive client in Spark to talk to HiveMetastore (HMS) for things like table or partition metadata information. In this case, Hive 2.3.9 used by Spark is already capable of communicating with HMS of other versions like Hive 3.x. So, could you share a bit of context on why you want to use Hive 3.1.3 with Spark?

Chao

On Thu, Sep 7, 2023 at 6:22 AM Agrawal, Sanket wrote:

> Hi,
>
> I tried using the Maven option and it's working. But we are not allowed to
> download jars at runtime from Maven because of some security restrictions.
>
> So I tried again, downloading Hive 3.1.3 and giving the location of the
> jars, and it worked this time. But now our Docker image has 40 new
> Critical vulnerabilities due to Hive (scanned by AWS Inspector).
>
> So the only solution I see here is to build *Spark 3.4.1 with Hive 3.1.3*.
> But when I do so, the build fails while compiling the files in
> /spark/sql/hive. When I build *Spark 3.4.1 with Hive 2.3.9*, the build
> completes successfully.
>
> Has anyone tried building Spark 3.4.1 with Hive 3.1.3 or higher?
>
> Thanks,
> Sanket A.
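Chao's point, that the Hive 2.3.9 client bundled with Spark can already talk to a Hive 3.x metastore, can be sketched as a launch configuration. This is only an illustration under stated assumptions: the metastore host `hms.example.internal` and port are hypothetical, and passing the metastore address through `spark.hadoop.hive.metastore.uris` is one common way to do it rather than something the thread prescribes. The two `spark.sql.hive.metastore.*` values shown are simply Spark 3.4's defaults written out explicitly.

```python
import shlex


def builtin_client_args(metastore_uri: str) -> list[str]:
    """Build spark-shell arguments that keep Spark's bundled Hive client."""
    conf = {
        # Spark 3.4.x defaults, spelled out so the intent is visible.
        "spark.sql.hive.metastore.version": "2.3.9",
        "spark.sql.hive.metastore.jars": "builtin",
        # Assumption: address of the remote Hive 3.x metastore (hypothetical).
        "spark.hadoop.hive.metastore.uris": metastore_uri,
    }
    args = ["spark-shell"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical host name, used only for illustration.
args = builtin_client_args("thrift://hms.example.internal:9083")
print(shlex.join(args))
```

With this setup no Hive 3.1.3 jars are added to the image at all; it does not, however, remove hive-exec-2.3.9.jar, which is the jar Sanket's scanner flags.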
Re: Spark 3.4.1 and Hive 3.1.3
Hi,

The Maven option is good for testing, but I wouldn't recommend running it in production from a security perspective. Also, depending on your setup, you might be downloading jars at the start of every Spark session.

By the way, Spark definitely does not require all the jars from Hive, since you are only trying to connect to the metastore. Can you just try pointing spark.sql.hive.metastore.jars.path to the following jars from Hive 3.1.3:

- hive-common-3.1.3.jar
- hive-metastore-3.1.3.jar
- hive-shims-common-3.1.3.jar

On Thu, Sep 7, 2023 at 3:20 PM Agrawal, Sanket wrote:

> Hi,
>
> I tried using the Maven option and it's working. But we are not allowed to
> download jars at runtime from Maven because of some security restrictions.
>
> So I tried again, downloading Hive 3.1.3 and giving the location of the
> jars, and it worked this time. But now our Docker image has 40 new
> Critical vulnerabilities due to Hive (scanned by AWS Inspector).
>
> So the only solution I see here is to build *Spark 3.4.1 with Hive 3.1.3*.
> But when I do so, the build fails while compiling the files in
> /spark/sql/hive. When I build *Spark 3.4.1 with Hive 2.3.9*, the build
> completes successfully.
>
> Has anyone tried building Spark 3.4.1 with Hive 3.1.3 or higher?
>
> Thanks,
> Sanket A.
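Yeachan's suggestion of pointing spark.sql.hive.metastore.jars.path at just three jars could look roughly like the sketch below. It relies on the fact that this setting accepts a comma-separated list of paths; the directory `/opt/hive/lib` is taken from the commands quoted in this thread, and the helper name `metastore_conf` is made up for illustration. Whether these three jars are actually sufficient is not confirmed anywhere in the thread.

```python
from pathlib import PurePosixPath

# Jar names as listed in the suggestion above.
METASTORE_JARS = [
    "hive-common-3.1.3.jar",
    "hive-metastore-3.1.3.jar",
    "hive-shims-common-3.1.3.jar",
]


def metastore_conf(hive_lib: str, jars=METASTORE_JARS) -> dict[str, str]:
    """Return the three metastore settings with an explicit jar list."""
    # jars.path takes comma-separated paths, so list each jar explicitly
    # instead of using a wildcard over the whole Hive lib directory.
    jar_paths = [str(PurePosixPath(hive_lib) / name) for name in jars]
    return {
        "spark.sql.hive.metastore.version": "3.1.3",
        "spark.sql.hive.metastore.jars": "path",
        "spark.sql.hive.metastore.jars.path": ",".join(jar_paths),
    }


conf = metastore_conf("/opt/hive/lib")
for key, value in conf.items():
    print(f'--conf "{key}={value}"')
```

Listing jars explicitly also means only those files need to be copied into the Docker image, which is the point of the suggestion given the scanner findings.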
RE: Spark 3.4.1 and Hive 3.1.3
Hi,

I tried using the Maven option and it's working. But we are not allowed to download jars at runtime from Maven because of some security restrictions.

So I tried again, downloading Hive 3.1.3 and giving the location of the jars, and it worked this time. But now our Docker image has 40 new Critical vulnerabilities due to Hive (scanned by AWS Inspector).

So the only solution I see here is to build Spark 3.4.1 with Hive 3.1.3. But when I do so, the build fails while compiling the files in /spark/sql/hive. When I build Spark 3.4.1 with Hive 2.3.9, the build completes successfully.

Has anyone tried building Spark 3.4.1 with Hive 3.1.3 or higher?

Thanks,
Sanket A.

From: Yeachan Park
Sent: Tuesday, September 5, 2023 8:52 PM
To: Agrawal, Sanket
Cc: user@spark.apache.org
Subject: [EXT] Re: Spark 3.4.1 and Hive 3.1.3

What's the full traceback when you run the same thing via spark-shell? So something like:

    $SPARK_HOME/bin/spark-shell \
      --conf "spark.sql.hive.metastore.version=3.1.3" \
      --conf "spark.sql.hive.metastore.jars=path" \
      --conf "spark.sql.hive.metastore.jars.path=/opt/hive/lib/*.jar"

W.r.t. building Hive, there's no need: either download it from https://downloads.apache.org/hive/hive-3.1.3/ or use the Maven option like Yasukazu suggested. If you do want to build it, make sure you are using Java 8 to do so.

On Tue, Sep 5, 2023 at 12:00 PM Agrawal, Sanket wrote:

Hi,

I tried pointing to Hive 3.1.3 using the command below, but I am still getting an error. I see that spark-hive-thriftserver_2.12/3.4.1 and spark-hive_2.12/3.4.1 have a dependency on Hive 2.3.9.

Command:

    pyspark --conf "spark.sql.hive.metastore.version=3.1.3" --conf "spark.sql.hive.metastore.jars=path" --conf "spark.sql.hive.metastore.jars.path=file://opt/hive/lib/*.jar"

Error:

[cid:image001.png@01D9E00D.CBDA2C50]

Also, when I try to build Spark with Hive 3.1.3, I get the following error.

[cid:image002.png@01D9E00D.CBDA2C50]

If anyone can give me some direction, it would be of great help.

Thanks,
Sanket

From: Yeachan Park
Sent: Tuesday, September 5, 2023 1:32 AM
To: Agrawal, Sanket
Cc: user@spark.apache.org
Subject: [EXT] Re: Spark 3.4.1 and Hive 3.1.3

Hi,

Why not download/build the Hive 3.1.3 bundle and tell Spark to use that? See https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html

Basically, set:

    spark.sql.hive.metastore.version 3.1.3
    spark.sql.hive.metastore.jars path
    spark.sql.hive.metastore.jars.path

On Mon, Sep 4, 2023 at 7:42 PM Agrawal, Sanket wrote:

Hi,

Has anyone tried building Spark 3.4.1 with Hive 3.1.3? I tried making the changes below in the Spark pom.xml, but it's failing.

pom.xml:

Error:

Can anyone help me with the required configurations?

Thanks,
SA
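One detail in the pyspark command quoted in this thread may be worth checking: it passes `file://opt/hive/lib/*.jar` (two slashes), whereas the spark-shell variant uses the plain path `/opt/hive/lib/*.jar`. The sketch below uses Python's generic URI parser to show how the two-slash and three-slash forms are split. It is only an illustration of URI syntax: how Spark and Hadoop resolve the value is not shown here, and since the error screenshots were not preserved, it is unknown whether this contributed to the failure.

```python
from urllib.parse import urlsplit


def describe(uri: str) -> tuple[str, str]:
    """Return (authority, path) as a generic URI parser sees them."""
    parts = urlsplit(uri)
    return parts.netloc, parts.path


# Two slashes: "opt" is read as the authority (host), not a directory.
print(describe("file://opt/hive/lib/*.jar"))   # ('opt', '/hive/lib/*.jar')

# Three slashes: empty authority, and the path starts at /opt.
print(describe("file:///opt/hive/lib/*.jar"))  # ('', '/opt/hive/lib/*.jar')
```

If the URI form is wanted, the three-slash spelling is the one whose path component matches the directory used elsewhere in the thread; the bare absolute path avoids the question entirely.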