RE: Spark 3.4.1 and Hive 3.1.3

2023-09-08 Thread Agrawal, Sanket
Hi Yasukazu,

I tried replacing the jar. The Spark code did not work, but the vulnerability 
was removed. I agree that even 3.1.3 has other vulnerabilities listed on the 
Maven page, but these are medium-level vulnerabilities. We are currently 
targeting Critical and High vulnerabilities only.

Thanks,
Sanket

From: Nagatomi Yasukazu 
Sent: Friday, September 8, 2023 9:35 AM
To: Agrawal, Sanket 
Cc: Chao Sun; Yeachan Park; user@spark.apache.org
Subject: [EXT] Re: Spark 3.4.1 and Hive 3.1.3

Hi Sanket,

While migrating to Hive 3.1.3 may resolve many issues, the link below suggests 
that there might still be some vulnerabilities present.
Do you think the specific vulnerability you're concerned about can be addressed 
with Hive 3.1.3?

https://mvnrepository.com/artifact/org.apache.hive/hive-exec/3.1.3

Regards,
Yasukazu


RE: Spark 3.4.1 and Hive 3.1.3

2023-09-07 Thread Agrawal, Sanket
Hi, I tried replacing just this JAR, but I am getting errors.

From: Nagatomi Yasukazu 
Sent: Friday, September 8, 2023 9:35 AM
To: Agrawal, Sanket 
Cc: Chao Sun; Yeachan Park; user@spark.apache.org
Subject: [EXT] Re: Spark 3.4.1 and Hive 3.1.3

Hi Sanket,

While migrating to Hive 3.1.3 may resolve many issues, the link below suggests 
that there might still be some vulnerabilities present.
Do you think the specific vulnerability you're concerned about can be addressed 
with Hive 3.1.3?

https://mvnrepository.com/artifact/org.apache.hive/hive-exec/3.1.3

Regards,
Yasukazu


Re: Spark 3.4.1 and Hive 3.1.3

2023-09-07 Thread Nagatomi Yasukazu
Hi Sanket,

While migrating to Hive 3.1.3 may resolve many issues, the link below
suggests that there might still be some vulnerabilities present.
Do you think the specific vulnerability you're concerned about can be
addressed with Hive 3.1.3?

https://mvnrepository.com/artifact/org.apache.hive/hive-exec/3.1.3

Regards,
Yasukazu

On Fri, Sep 8, 2023 at 12:36, Agrawal, Sanket wrote:

> Hi Chao,
>
> The reason to migrate to Hive 3.1.3 is to remove a vulnerability from
> hive-exec-2.3.9.jar.
>
> Thanks,
> Sanket

RE: Spark 3.4.1 and Hive 3.1.3

2023-09-07 Thread Agrawal, Sanket
Hi Chao,

The reason to migrate to Hive 3.1.3 is to remove a vulnerability from 
hive-exec-2.3.9.jar.

Thanks
Sanket

From: Chao Sun 
Sent: Thursday, September 7, 2023 10:23 PM
To: Agrawal, Sanket 
Cc: Yeachan Park; user@spark.apache.org
Subject: [EXT] Re: Spark 3.4.1 and Hive 3.1.3

Hi Sanket,

Spark 3.4.1 currently only works with Hive 2.3.9, and it would require a lot of 
work to upgrade the Hive version to 3.x and up.

Normally though, you only need the Hive client in Spark to talk to 
HiveMetastore (HMS) for things like table or partition metadata information. In 
this case, Hive 2.3.9 used by Spark is already capable of communicating with 
HMS of other versions like Hive 3.x. So, could you share a bit of context why 
you want to use Hive 3.1.3 with Spark?

Chao



Re: Spark 3.4.1 and Hive 3.1.3

2023-09-07 Thread Chao Sun
Hi Sanket,

Spark 3.4.1 currently only works with Hive 2.3.9, and it would require a
lot of work to upgrade the Hive version to 3.x and up.

Normally though, you only need the Hive client in Spark to talk to
HiveMetastore (HMS) for things like table or partition metadata
information. In this case, Hive 2.3.9 used by Spark is already capable of
communicating with HMS of other versions like Hive 3.x. So, could you share
a bit of context why you want to use Hive 3.1.3 with Spark?

Chao
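
To make the point about the built-in client concrete, here is a minimal Python sketch of the settings that would keep Spark on its bundled Hive 2.3.9 client while pointing it at a remote metastore. The metastore URI is a hypothetical placeholder, and the claim that this client can talk to a Hive 3.x HMS comes from the message above; it has not been verified here.

```python
def builtin_client_conf(metastore_uri):
    """Settings for Spark's bundled Hive client against a remote metastore.

    2.3.9 and "builtin" are the Spark 3.4.x defaults for these two options;
    they are spelled out only to make the intent explicit.
    """
    return {
        "spark.sql.catalogImplementation": "hive",
        "spark.sql.hive.metastore.version": "2.3.9",
        "spark.sql.hive.metastore.jars": "builtin",
        # Hypothetical metastore endpoint, passed through to the Hadoop conf.
        "spark.hadoop.hive.metastore.uris": metastore_uri,
    }


conf = builtin_client_conf("thrift://metastore.example.com:9083")
for key, value in sorted(conf.items()):
    print(f"{key} {value}")
```

Each pair could then be handed to `SparkSession.builder.config(key, value)` before calling `enableHiveSupport().getOrCreate()`, or written to spark-defaults.conf.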


On Thu, Sep 7, 2023 at 6:22 AM Agrawal, Sanket
 wrote:

> Hi
>
> I tried using the Maven option and it works. But we are not allowed to
> download jars at runtime from Maven because of some security restrictions.
>
> So I tried again, downloading Hive 3.1.3 and giving the location of the
> jars, and it worked this time. But now our Docker image has 40 new
> Critical vulnerabilities due to Hive (scanned by AWS Inspector).
>
> So the only solution I see here is to build Spark 3.4.1 with Hive 3.1.3.
> But when I do so, the build fails while compiling the files in
> /spark/sql/hive. When I build Spark 3.4.1 with Hive 2.3.9, the build
> completes successfully.
>
> Has anyone tried building Spark 3.4.1 with Hive 3.1.3 or higher?
>
> Thanks,
> Sanket A.

Re: Spark 3.4.1 and Hive 3.1.3

2023-09-07 Thread Yeachan Park
Hi,

The maven option is good for testing, but I wouldn't recommend running it in
production from a security perspective. Also, depending on your setup, you
might be downloading jars at the start of every Spark session.

By the way, Spark definitely does not require all the jars from Hive, since
you are only trying to connect to the metastore. Can you just try pointing
spark.sql.hive.metastore.jars.path to the following jars from Hive 3.1.3:
- hive-common-3.1.3.jar
- hive-metastore-3.1.3.jar
- hive-shims-common-3.1.3.jar
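
As a rough illustration of the suggestion above, the following Python sketch builds a comma-separated value for spark.sql.hive.metastore.jars.path from just those three jars. The /opt/hive/lib directory is borrowed from the commands elsewhere in this thread and is an assumption about the layout; whether these three jars are sufficient is not confirmed anywhere in the thread.

```python
from pathlib import PurePosixPath

# The three jars named in the message above.
METASTORE_JARS = [
    "hive-common-3.1.3.jar",
    "hive-metastore-3.1.3.jar",
    "hive-shims-common-3.1.3.jar",
]


def metastore_jars_path(hive_lib_dir, jar_names=METASTORE_JARS):
    """Join the selected jars into one comma-separated path list."""
    base = PurePosixPath(hive_lib_dir)
    return ",".join(str(base / name) for name in jar_names)


# Assumed install location; adjust to wherever Hive 3.1.3 was unpacked.
jars_path = metastore_jars_path("/opt/hive/lib")
print(jars_path)
```

The resulting string would be used as the value of spark.sql.hive.metastore.jars.path, with spark.sql.hive.metastore.jars set to path.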

On Thu, Sep 7, 2023 at 3:20 PM Agrawal, Sanket 
wrote:

> Hi
>
> I tried using the Maven option and it works. But we are not allowed to
> download jars at runtime from Maven because of some security restrictions.
>
> So I tried again, downloading Hive 3.1.3 and giving the location of the
> jars, and it worked this time. But now our Docker image has 40 new
> Critical vulnerabilities due to Hive (scanned by AWS Inspector).
>
> So the only solution I see here is to build Spark 3.4.1 with Hive 3.1.3.
> But when I do so, the build fails while compiling the files in
> /spark/sql/hive. When I build Spark 3.4.1 with Hive 2.3.9, the build
> completes successfully.
>
> Has anyone tried building Spark 3.4.1 with Hive 3.1.3 or higher?
>
> Thanks,
> Sanket A.

RE: Spark 3.4.1 and Hive 3.1.3

2023-09-07 Thread Agrawal, Sanket
Hi

I tried using the Maven option and it works. But we are not allowed to 
download jars at runtime from Maven because of some security restrictions.

So I tried again, downloading Hive 3.1.3 and giving the location of the jars, 
and it worked this time. But now our Docker image has 40 new Critical 
vulnerabilities due to Hive (scanned by AWS Inspector).

So the only solution I see here is to build Spark 3.4.1 with Hive 3.1.3. But 
when I do so, the build fails while compiling the files in /spark/sql/hive. 
When I build Spark 3.4.1 with Hive 2.3.9, the build completes successfully.

Has anyone tried building Spark 3.4.1 with Hive 3.1.3 or higher?

Thanks,
Sanket A.

From: Yeachan Park 
Sent: Tuesday, September 5, 2023 8:52 PM
To: Agrawal, Sanket 
Cc: user@spark.apache.org
Subject: [EXT] Re: Spark 3.4.1 and Hive 3.1.3

What's the full traceback when you run the same thing via spark-shell? So 
something like:

$SPARK_HOME/bin/spark-shell \
   --conf "spark.sql.hive.metastore.version=3.1.3" \
   --conf "spark.sql.hive.metastore.jars=path" \
   --conf "spark.sql.hive.metastore.jars.path=/opt/hive/lib/*.jar"
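
The same invocation can be assembled from a dictionary of settings, which may be handy when the options are kept in code rather than typed by hand. This is only a sketch: /opt/spark stands in for $SPARK_HOME and is an assumption, and the helper name is made up for illustration.

```python
import shlex


def spark_shell_argv(spark_home, conf):
    """Build the spark-shell argument list, one --conf pair per setting."""
    argv = [f"{spark_home}/bin/spark-shell"]
    for key, value in conf.items():
        argv += ["--conf", f"{key}={value}"]
    return argv


conf = {
    "spark.sql.hive.metastore.version": "3.1.3",
    "spark.sql.hive.metastore.jars": "path",
    "spark.sql.hive.metastore.jars.path": "/opt/hive/lib/*.jar",
}

# /opt/spark is a placeholder for the real SPARK_HOME.
argv = spark_shell_argv("/opt/spark", conf)

# shlex.join quotes the argument containing "*" so that the shell does not
# expand the wildcard before Spark sees it.
print(shlex.join(argv))
```

Passing `argv` to `subprocess.run` avoids the shell entirely, in which case no quoting is needed.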

W.r.t. building Hive, there's no need: either download it from 
https://downloads.apache.org/hive/hive-3.1.3/ or use the maven option like 
Yasukazu suggested. If you do want to build it, make sure you are using Java 8 
to do so.

On Tue, Sep 5, 2023 at 12:00 PM Agrawal, Sanket wrote:
Hi,

I tried pointing to Hive 3.1.3 using the command below, but I am still getting 
an error. I see that spark-hive-thriftserver_2.12/3.4.1 and 
spark-hive_2.12/3.4.1 depend on Hive 2.3.9.

Command: pyspark --conf "spark.sql.hive.metastore.version=3.1.3" --conf 
"spark.sql.hive.metastore.jars=path" --conf 
"spark.sql.hive.metastore.jars.path=file://opt/hive/lib/*.jar"

Error:
[inline image image001.png, not preserved in the archive]


Also, when I try to build Spark with Hive 3.1.3, I get the following error.
[inline image image002.png, not preserved in the archive]

If anyone can give me some direction, it would be of great help.

Thanks,
Sanket
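
One detail in the pyspark command in the message above may be worth checking: the jar location is written as file://opt/hive/lib/*.jar, with two slashes. Under generic URI syntax the text right after "//" is the authority (host), not part of the path. The Python sketch below only illustrates that parsing rule using the standard library. It is not a claim about how Spark or Hadoop resolved the path, and since the error output is not preserved in the archive, it is unknown whether this contributed to the failure.

```python
from urllib.parse import urlparse


def describe_file_uri(uri):
    """Split a URI into scheme, authority (host) and path."""
    parts = urlparse(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
    }


# Two slashes, as in the command above: "opt" is read as the host.
two_slashes = describe_file_uri("file://opt/hive/lib/*.jar")

# Three slashes: empty host, and the path starts at /opt.
three_slashes = describe_file_uri("file:///opt/hive/lib/*.jar")

print(two_slashes)
print(three_slashes)
```

The spark-shell example elsewhere in the thread sidesteps the question by using the plain path /opt/hive/lib/*.jar with no scheme.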

From: Yeachan Park
Sent: Tuesday, September 5, 2023 1:32 AM
To: Agrawal, Sanket
Cc: user@spark.apache.org
Subject: [EXT] Re: Spark 3.4.1 and Hive 3.1.3

Hi,

Why not download/build the Hive 3.1.3 bundle and tell Spark to use that? See 
https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html

Basically, set:
spark.sql.hive.metastore.version 3.1.3
spark.sql.hive.metastore.jars path
spark.sql.hive.metastore.jars.path 

On Mon, Sep 4, 2023 at 7:42 PM Agrawal, Sanket wrote:
Hi,

Has anyone tried building Spark 3.4.1 with Hive 3.1.3? I tried making the 
changes below in the Spark pom.xml, but the build is failing.

Pom.xml

Error:

Can anyone help me with the required configurations?

Thanks,
SA
