Re: Re: Unable to create view due to up cast error when migrating from Hive to Spark
Thank you for the reply!

At 2022-05-18 20:27:27, "Wenchen Fan" wrote:
> A view is essentially a SQL query. It's fragile to share views between Spark and Hive, because different systems have different SQL dialects: they may interpret the view's SQL query differently and introduce unexpected behavior. In this case, Spark returns a decimal type for gender * 0.3 - 0.1, but Hive returns double. The view schema was fixed by Hive at creation time, and it no longer matches the view's SQL query when Spark reads the view. We need to re-create this view using Spark. In fact, I think the same applies to every Hive view we want to use from Spark.
Re: Unable to create view due to up cast error when migrating from Hive to Spark
A view is essentially a SQL query. It's fragile to share views between Spark and Hive, because different systems have different SQL dialects: they may interpret the view's SQL query differently and introduce unexpected behavior. In this case, Spark returns a decimal type for gender * 0.3 - 0.1, but Hive returns double. The view schema was fixed by Hive at creation time, and it no longer matches the view's SQL query when Spark reads the view. We need to re-create this view using Spark. In fact, I think the same applies to every Hive view we want to use from Spark.

On Wed, May 18, 2022 at 7:03 PM beliefer wrote:
> During the migration from Hive to Spark, there was a problem with the SQL used to create views in Hive: SQL that legally creates a view in Hive raises an error when executed in Spark SQL.
>
> The SQL is as follows:
>
>     CREATE VIEW test_db.my_view AS
>     SELECT
>       CASE
>         WHEN age > 12 THEN gender * 0.3 - 0.1
>       END AS TT,
>       gender,
>       age,
>       careers,
>       education
>     FROM test_db.my_table;
>
> The error message is as follows:
>
>     Cannot up cast TT from decimal(13,1) to double.
>     The type path of the target object is:
>     You can either add an explicit cast to the input data or choose a higher
>     precision type of the field in the target object
>
> How should we solve this problem?
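Following the advice above, a minimal sketch of re-creating the view from Spark, using the table and column names from the original post. The explicit CAST pins the column type to double so the stored view schema and the analyzed query agree on both engines (this is an illustration of the suggested fix, not a snippet from the thread):

```sql
-- Sketch: re-create the Hive view from Spark SQL so the stored schema
-- matches what Spark derives. The explicit CAST pins TT to double.
DROP VIEW IF EXISTS test_db.my_view;

CREATE VIEW test_db.my_view AS
SELECT
  CASE
    WHEN age > 12 THEN CAST(gender * 0.3 - 0.1 AS DOUBLE)
  END AS TT,
  gender,
  age,
  careers,
  education
FROM test_db.my_table;
```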
Unable to create view due to up cast error when migrating from Hive to Spark
During the migration from Hive to Spark, there was a problem with the SQL used to create views in Hive: SQL that legally creates a view in Hive raises an error when executed in Spark SQL.

The SQL is as follows:

    CREATE VIEW test_db.my_view AS
    SELECT
      CASE
        WHEN age > 12 THEN gender * 0.3 - 0.1
      END AS TT,
      gender,
      age,
      careers,
      education
    FROM test_db.my_table;

The error message is as follows:

    Cannot up cast TT from decimal(13,1) to double.
    The type path of the target object is:
    You can either add an explicit cast to the input data or choose a higher
    precision type of the field in the target object

How should we solve this problem?
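The decimal(13,1) in the error can be reproduced from the precision/scale propagation rules Spark applies to decimal arithmetic. The following is a sketch of those rules in plain Python, not Spark code; it assumes gender is an int (which Spark widens to decimal(10,0) for this arithmetic) and that the literals 0.3 and 0.1 are parsed as decimal(1,1):

```python
# Sketch of Spark's decimal result-type rules (see DecimalPrecision).
# These are the SQL-standard formulas; the types assumed for the inputs
# (decimal(10,0) for an int, decimal(1,1) for 0.3 and 0.1) are assumptions.

def multiply(p1, s1, p2, s2):
    # result type of decimal(p1,s1) * decimal(p2,s2)
    return p1 + p2 + 1, s1 + s2

def subtract(p1, s1, p2, s2):
    # result type of decimal(p1,s1) - decimal(p2,s2)
    scale = max(s1, s2)
    precision = max(p1 - s1, p2 - s2) + scale + 1
    return precision, scale

p, s = multiply(10, 0, 1, 1)   # gender * 0.3 -> decimal(12, 1)
p, s = subtract(p, s, 1, 1)    # ... - 0.1    -> decimal(13, 1)
print(f"decimal({p},{s})")     # decimal(13,1), matching the error message
```

Hive, by contrast, widens this arithmetic to double (as the reply above notes), which is why the double that Hive stored in the view schema disagrees with the decimal(13,1) Spark derives from the same query text.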
Unable to create view due to up cast error when migrating from Hive to Spark
During the migration from Hive to Spark, there was a problem when a view created in Hive was used from Spark SQL. The original Hive SQL is shown below:

    CREATE VIEW myView AS
    SELECT
      CASE
        WHEN age > 12 THEN CAST(gender * 0.3 - 0.1 AS double)
      END AS TT,
      gender,
      age
    FROM myTable;

Users query the view with Spark SQL, but encounter an up cast error. The error message is as follows:

    Cannot up cast TT from decimal(13,1) to double.
    The type path of the target object is:
    You can either add an explicit cast to the input data or choose a higher
    precision type of the field in the target object

How should we solve this problem?
Re: Migrating from hive to spark
Ok, the first link throws some clues:

> "... Hive excels in batch disk processing with a MapReduce execution engine. Actually, Hive can also use Spark as its execution engine, which also has a Hive context allowing us to query Hive tables. Despite all the great things Hive can solve, this post is to talk about why we moved our ETLs to the 'not so new' player for batch processing ..."

Great, you want to use Spark for ETL, as opposed to Hive, for cleaning up your data once your upstream CDC files are landed on HDFS? Correct?

view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Thu, 17 Jun 2021 at 08:17, Battula, Brahma Reddy wrote:
> Hi Talebzadeh,
>
> Looks like I confused things, sorry. I have now changed the subject to make it clear.
>
> Facebook has tried migration from Hive to Spark. Check the following links for the same:
>
> https://www.dcsl.com/migrating-from-hive-to-spark/
> https://databricks.com/session/experiences-migrating-hive-workload-to-sparksql
> https://www.cloudwalker.io/2019/02/19/spark-ad-hoc-querying/
>
> Would like to know: has anybody else migrated like this? Any challenges or prerequisites (like hardware) to migrate? Any tools to evaluate before we migrate?
>
> *From:* Mich Talebzadeh
> *Date:* Tuesday, 15 June 2021 at 10:36 PM
> *Subject:* Re: Spark-sql can replace Hive ?
>
> OK, you mean use spark.sql as opposed to HiveContext.sql?
>
>     val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>     HiveContext.sql("")
>
> replace with
>
>     spark.sql("")
>
> ?
>
> On Tue, 15 Jun 2021 at 18:00, Battula, Brahma Reddy wrote:
> > Currently I am using the Hive SQL engine for ad-hoc queries. As spark-sql also supports this, I want to migrate from Hive.
>
> *From:* Mich Talebzadeh
> *Date:* Thursday, 10 June 2021 at 8:12 PM
> *Subject:* Re: Spark-sql can replace Hive ?
>
> These are different things. Spark provides a computational layer and a dialect of SQL based on Hive.
>
> Hive is a DW on top of HDFS. What are you trying to replace?
>
> HTH
>
> On Thu, 10 Jun 2021 at 12:09, Battula, Brahma Reddy wrote:
> > Thanks for the prompt reply.
> >
> > I want to replace Hive with Spark.
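For reference, HiveContext was deprecated in Spark 2.0 in favor of SparkSession. A minimal sketch of the modern equivalent of the snippet discussed above (assumes a Spark 2.x+ build with Hive support; this is an illustration, not code from the thread):

```scala
// Sketch: SparkSession with Hive support replaces the old HiveContext.
import org.apache.spark.sql.SparkSession

object HiveQueryExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-to-spark")
      .enableHiveSupport()   // connects spark.sql to the Hive metastore
      .getOrCreate()

    // spark.sql replaces HiveContext.sql
    spark.sql("SHOW TABLES").show()
  }
}
```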
Migrating from hive to spark
Hi Talebzadeh,

Looks like I confused things, sorry. I have now changed the subject to make it clear.

Facebook has tried migration from Hive to Spark. Check the following links for the same:

https://www.dcsl.com/migrating-from-hive-to-spark/
https://databricks.com/session/experiences-migrating-hive-workload-to-sparksql
https://www.cloudwalker.io/2019/02/19/spark-ad-hoc-querying/

Would like to know: has anybody else migrated like this? Any challenges or prerequisites (like hardware) to migrate? Any tools to evaluate before we migrate?

From: Mich Talebzadeh
Date: Tuesday, 15 June 2021 at 10:36 PM
To: Battula, Brahma Reddy
Subject: Re: Spark-sql can replace Hive ?

OK, you mean use spark.sql as opposed to HiveContext.sql?

    val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    HiveContext.sql("")

replace with

    spark.sql("")

?

On Tue, 15 Jun 2021 at 18:00, Battula, Brahma Reddy wrote:
> Currently I am using the Hive SQL engine for ad-hoc queries. As spark-sql also supports this, I want to migrate from Hive.

From: Mich Talebzadeh
Date: Thursday, 10 June 2021 at 8:12 PM
Subject: Re: Spark-sql can replace Hive ?

These are different things. Spark provides a computational layer and a dialect of SQL based on Hive.

Hive is a DW on top of HDFS. What are you trying to replace?

HTH

On Thu, 10 Jun 2021 at 12:09, Battula, Brahma Reddy wrote:
> Thanks for the prompt reply.
>
> I want to replace Hive with Spark.

From: ayan guha
Date: Thursday, 10 June 2021 at 4:35 PM
Subject: Re: Spark-sql can replace Hive ?

Would you mind expanding the ask?
Spark SQL can use Hive by itself.

On Thu, 10 Jun 2021 at 8:58 pm, Battula, Brahma Reddy wrote:
> Hi,
>
> Would like to know of any references/docs for replacing Hive with spark-sql completely, e.g. how to migrate the existing data in Hive?
>
> Thanks

--
Best Regards,
Ayan Guha