I created the table using Spark SQL. Then I tried to retrieve it from Hive, and that is where the issue is. I can still retrieve it from Spark with no problems. Why is Hive not returning any data?
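A low-cost check worth doing up front (a sketch; `mine` is the table name used later in this thread): compare the location each engine reports for the table, since Spark and Hive can resolve managed-table paths from different settings.

```sql
-- Run in the Hive CLI:
DESCRIBE FORMATTED mine;   -- note the "Location:" field
-- Run the same statement from Spark, e.g. spark.sql("DESCRIBE FORMATTED mine"),
-- and note the "Location" row there as well.
-- If one engine reports file:/... and the other hdfs://..., the engine
-- pointing at the wrong (empty) directory will return zero rows.
```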
On Sun, Feb 11, 2018 at 3:06 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Simple question: have you created the table through Spark SQL or Hive?
>
> I recall similar issues a while back.
>
> val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
> //val sqlContext = new HiveContext(sc)
> println("\nStarted at"); spark.sql("SELECT FROM_unixtime(unix_timestamp(), 'dd/MM/yyyy HH:mm:ss.ss')").collect.foreach(println)
> //
> // Need to create and populate target ORC table sales in database test in Hive
> //
> HiveContext.sql("use test")
> //
> // Drop and create table
> //
> HiveContext.sql("DROP TABLE IF EXISTS test.dummy2")
> var sqltext = ""
> sqltext = """
> CREATE TABLE test.dummy2
> (
>     ID INT
>   , CLUSTERED INT
>   , SCATTERED INT
>   , RANDOMISED INT
>   , RANDOM_STRING VARCHAR(50)
>   , SMALL_VC VARCHAR(10)
>   , PADDING VARCHAR(10)
> )
> --CLUSTERED BY (ID) INTO 256 BUCKETS
> STORED AS ORC
> TBLPROPERTIES ( "orc.compress"="SNAPPY",
>   "orc.create.index"="true",
>   "orc.bloom.filter.columns"="ID",
>   "orc.bloom.filter.fpp"="0.05",
>   "orc.stripe.size"="268435456",
>   "orc.row.index.stride"="10000" )
> """
> HiveContext.sql(sqltext)
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
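Mich's suggestion above amounts to letting Hive own the table definition. A minimal sketch of that workflow (table and column names taken from his example; the Spark-side INSERT source is hypothetical):

```sql
-- 1) In the Hive CLI, create the table with plain Hive DDL so the metastore
--    entry is fully Hive-compatible:
CREATE TABLE test.dummy2 (
    id            INT,
    clustered     INT,
    scattered     INT,
    randomised    INT,
    random_string VARCHAR(50),
    small_vc      VARCHAR(10),
    padding       VARCHAR(10)
)
STORED AS ORC;
-- 2) From Spark, append into the existing table instead of calling
--    saveAsTable(), so no Spark-specific provider metadata is written:
--    spark.sql("INSERT INTO test.dummy2 SELECT * FROM some_staging_view")
```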
> On 11 February 2018 at 07:54, ☼ R Nair (रविशंकर नायर) <ravishankar.n...@gmail.com> wrote:
>
>> Hi,
>>
>> Here you go:
>>
>> hive> show create table mine;
>> OK
>> CREATE TABLE `mine`(
>>   `policyid` int,
>>   `statecode` string,
>>   `socialid` string,
>>   `county` string,
>>   `eq_site_limit` decimal(10,2),
>>   `hu_site_limit` decimal(10,2),
>>   `fl_site_limit` decimal(10,2),
>>   `fr_site_limit` decimal(10,2),
>>   `tiv_2014` decimal(10,2),
>>   `tiv_2015` decimal(10,2),
>>   `eq_site_deductible` int,
>>   `hu_site_deductible` int,
>>   `fl_site_deductible` int,
>>   `fr_site_deductible` int,
>>   `latitude` decimal(6,6),
>>   `longitude` decimal(6,6),
>>   `line` string,
>>   `construction` string,
>>   `point_granularity` int)
>> ROW FORMAT SERDE
>>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
>> WITH SERDEPROPERTIES (
>>   'path'='hdfs://localhost:8020/user/hive/warehouse/mine')
>> STORED AS INPUTFORMAT
>>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
>> OUTPUTFORMAT
>>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
>> LOCATION
>>   'file:/Users/ravishankarnair/spark-warehouse/mine'
>> TBLPROPERTIES (
>>   'spark.sql.sources.provider'='parquet',
>>   'spark.sql.sources.schema.numParts'='1',
>>   'spark.sql.sources.schema.part.0'='{\"type\":\"struct\",\"fields\":[{\"name\":\"policyid\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"policyid\",\"scale\":0}},{\"name\":\"statecode\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"statecode\",\"scale\":0}},{\"name\":\"Socialid\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"Socialid\",\"scale\":0}},{\"name\":\"county\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"county\",\"scale\":0}},{\"name\":\"eq_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_limit\",\"scale\":2}},{\"name\":\"hu_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_limit\",\"scale\":2}},{\"name\":\"fl_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_limit\",\"scale\":2}},{\"name\":\"fr_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_limit\",\"scale\":2}},{\"name\":\"tiv_2014\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2014\",\"scale\":2}},{\"name\":\"tiv_2015\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2015\",\"scale\":2}},{\"name\":\"eq_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_deductible\",\"scale\":0}},{\"name\":\"hu_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_deductible\",\"scale\":0}},{\"name\":\"fl_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_deductible\",\"scale\":0}},{\"name\":\"fr_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_deductible\",\"scale\":0}},{\"name\":\"latitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"latitude\",\"scale\":6}},{\"name\":\"longitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"longitude\",\"scale\":6}},{\"name\":\"line\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"line\",\"scale\":0}},{\"name\":\"construction\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"construction\",\"scale\":0}},{\"name\":\"point_granularity\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"point_granularity\",\"scale\":0}}]}',
>>   'transient_lastDdlTime'='1518335598')
>> Time taken: 0.13 seconds, Fetched: 35 row(s)
>>
>> On Sun, Feb 11, 2018 at 2:36 AM, Shmuel Blitz <shmuel.bl...@similarweb.com>
wrote:
>>
>>> Please run the following command, and paste the result:
>>>
>>> SHOW CREATE TABLE <<TABLE-NAME>>
>>>
>>> On Sun, Feb 11, 2018 at 7:56 AM, ☼ R Nair (रविशंकर नायर) <ravishankar.n...@gmail.com> wrote:
>>>
>>>> No, no luck.
>>>>
>>>> Thanks
>>>>
>>>> On Sun, Feb 11, 2018 at 12:48 AM, Deepak Sharma <deepakmc...@gmail.com> wrote:
>>>>
>>>>> In the Hive CLI:
>>>>>
>>>>> msck repair table <table_name>;
>>>>>
>>>>> Thanks
>>>>> Deepak
>>>>>
>>>>> On Feb 11, 2018 11:14, "☼ R Nair (रविशंकर नायर)" <ravishankar.n...@gmail.com> wrote:
>>>>>
>>>>>> No, can you please explain the command? Let me try now.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> On Sun, Feb 11, 2018 at 12:40 AM, Deepak Sharma <deepakmc...@gmail.com> wrote:
>>>>>>
>>>>>>> I am not sure about the exact issue, but I see you are partitioning while writing from Spark.
>>>>>>> Did you try msck repair on the table before reading it in Hive?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Deepak
>>>>>>>
>>>>>>> On Feb 11, 2018 11:06, "☼ R Nair (रविशंकर नायर)" <ravishankar.n...@gmail.com> wrote:
>>>>>>>
>>>>>>>> All,
>>>>>>>>
>>>>>>>> Thanks for the inputs. Again, I am not successful. I think we need to resolve this, as it is a very common requirement.
>>>>>>>>
>>>>>>>> Please go through my complete code:
>>>>>>>>
>>>>>>>> STEP 1: Started the Spark shell as: spark-shell --master yarn
>>>>>>>>
>>>>>>>> STEP 2: The following code is given as input to the Spark shell:
>>>>>>>>
>>>>>>>> import org.apache.spark.sql.Row
>>>>>>>> import org.apache.spark.sql.SparkSession
>>>>>>>> val warehouseLocation = "/user/hive/warehouse"
>>>>>>>>
>>>>>>>> val spark = SparkSession.builder().appName("Spark Hive Example").config("spark.sql.warehouse.dir", warehouseLocation).enableHiveSupport().getOrCreate()
>>>>>>>>
>>>>>>>> import org.apache.spark.sql._
>>>>>>>> var passion_df = spark.read.
>>>>>>>>   format("jdbc").
>>>>>>>>   option("url", "jdbc:mysql://localhost:3307/policies").
>>>>>>>>   option("driver", "com.mysql.jdbc.Driver").
>>>>>>>>   option("user", "root").
>>>>>>>>   option("password", "root").
>>>>>>>>   option("dbtable", "insurancedetails").
>>>>>>>>   option("partitionColumn", "policyid").
>>>>>>>>   option("lowerBound", "1").
>>>>>>>>   option("upperBound", "100000").
>>>>>>>>   option("numPartitions", "4").
>>>>>>>>   load()
>>>>>>>> //Made sure that passion_df is created, as passion_df.show(5) shows me correct data.
>>>>>>>> passion_df.write.saveAsTable("default.mine") //Default parquet
>>>>>>>>
>>>>>>>> STEP 3: Went to Hive and started the Hive prompt.
>>>>>>>>
>>>>>>>> hive> show tables;
>>>>>>>> OK
>>>>>>>> callcentervoicelogs
>>>>>>>> mine
>>>>>>>> Time taken: 0.035 seconds, Fetched: 2 row(s)
>>>>>>>> //As you can see, Hive is showing the table "mine" in the default schema.
>>>>>>>>
>>>>>>>> STEP 4: HERE IS THE PROBLEM.
>>>>>>>>
>>>>>>>> hive> select * from mine;
>>>>>>>> OK
>>>>>>>> Time taken: 0.354 seconds
>>>>>>>> hive>
>>>>>>>> //Where is the data???
>>>>>>>>
>>>>>>>> STEP 5: See the below command on Hive:
>>>>>>>>
>>>>>>>> describe formatted mine;
>>>>>>>> OK
>>>>>>>> # col_name              data_type       comment
>>>>>>>>
>>>>>>>> policyid                int
>>>>>>>> statecode               string
>>>>>>>> socialid                string
>>>>>>>> county                  string
>>>>>>>> eq_site_limit           decimal(10,2)
>>>>>>>> hu_site_limit           decimal(10,2)
>>>>>>>> fl_site_limit           decimal(10,2)
>>>>>>>> fr_site_limit           decimal(10,2)
>>>>>>>> tiv_2014                decimal(10,2)
>>>>>>>> tiv_2015                decimal(10,2)
>>>>>>>> eq_site_deductible      int
>>>>>>>> hu_site_deductible      int
>>>>>>>> fl_site_deductible      int
>>>>>>>> fr_site_deductible      int
>>>>>>>> latitude                decimal(6,6)
>>>>>>>> longitude               decimal(6,6)
>>>>>>>> line                    string
>>>>>>>> construction            string
>>>>>>>> point_granularity       int
>>>>>>>>
>>>>>>>> # Detailed Table Information
>>>>>>>> Database:           default
>>>>>>>> Owner:              ravishankarnair
>>>>>>>> CreateTime:         Sun Feb 11 00:26:40 EST 2018
>>>>>>>> LastAccessTime:     UNKNOWN
>>>>>>>> Protect Mode:       None
>>>>>>>> Retention:          0
>>>>>>>> Location:           file:/Users/ravishankarnair/spark-warehouse/mine
>>>>>>>> Table Type:         MANAGED_TABLE
>>>>>>>> Table Parameters:
>>>>>>>>   spark.sql.sources.provider          parquet
>>>>>>>>   spark.sql.sources.schema.numParts   1
>>>>>>>>   spark.sql.sources.schema.part.0     {\"type\":\"struct\",\"fields\":[{\"name\":\"policyid\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"policyid\",\"scale\":0}},{\"name\":\"statecode\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"statecode\",\"scale\":0}},{\"name\":\"Socialid\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"Socialid\",\"scale\":0}},{\"name\":\"county\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"county\",\"scale\":0}},{\"name\":\"eq_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_limit\",\"scale\":2}},{\"name\":\"hu_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_limit\",\"scale\":2}},{\"name\":\"fl_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_limit\",\"scale\":2}},{\"name\":\"fr_site_limit\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_limit\",\"scale\":2}},{\"name\":\"tiv_2014\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2014\",\"scale\":2}},{\"name\":\"tiv_2015\",\"type\":\"decimal(10,2)\",\"nullable\":true,\"metadata\":{\"name\":\"tiv_2015\",\"scale\":2}},{\"name\":\"eq_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"eq_site_deductible\",\"scale\":0}},{\"name\":\"hu_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"hu_site_deductible\",\"scale\":0}},{\"name\":\"fl_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fl_site_deductible\",\"scale\":0}},{\"name\":\"fr_site_deductible\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"fr_site_deductible\",\"scale\":0}},{\"name\":\"latitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"latitude\",\"scale\":6}},{\"name\":\"longitude\",\"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{\"name\":\"longitude\",\"scale\":6}},{\"name\":\"line\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"line\",\"scale\":0}},{\"name\":\"construction\",\"type\":\"string\",\"nullable\":true,\"metadata\":{\"name\":\"construction\",\"scale\":0}},{\"name\":\"point_granularity\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{\"name\":\"point_granularity\",\"scale\":0}}]}
>>>>>>>>   transient_lastDdlTime               1518326800
>>>>>>>>
>>>>>>>> # Storage Information
>>>>>>>> SerDe Library:      org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
>>>>>>>> InputFormat:        org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
>>>>>>>> OutputFormat:       org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
>>>>>>>> Compressed:         No
>>>>>>>> Num Buckets:        -1
>>>>>>>> Bucket Columns:     []
>>>>>>>> Sort Columns:       []
>>>>>>>> Storage Desc Params:
>>>>>>>>   path                    hdfs://localhost:8020/user/hive/warehouse/mine
>>>>>>>>   serialization.format    1
>>>>>>>> Time taken: 0.077 seconds, Fetched: 48 row(s)
>>>>>>>>
>>>>>>>> Now I seek your advice and support. What is the issue? Am I doing something wrong, or is it a bug? I am using Spark 2.2.1, Hive 1.2.1, Hadoop 2.7.3. All classpath and configuration settings are correct.
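An editorial aside: the `describe formatted` output above contains the likely culprit. Hive's `Location:` is `file:/Users/ravishankarnair/spark-warehouse/mine`, while the `path` storage property (which Spark reads) is `hdfs://localhost:8020/user/hive/warehouse/mine`. Hive scans the LOCATION directory, so if the Parquet files actually live on HDFS it finds nothing and returns zero rows. A hedged fix, assuming the data really is at the HDFS path:

```sql
-- Point Hive's metastore entry at the directory that actually holds the data:
ALTER TABLE mine SET LOCATION 'hdfs://localhost:8020/user/hive/warehouse/mine';
-- Then re-test from the Hive prompt:
SELECT * FROM mine LIMIT 5;
```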
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Ravion
>>>>>>>>
>>>>>>>> On Fri, Feb 9, 2018 at 1:29 PM, Nicholas Hakobian <nicholas.hakob...@rallyhealth.com> wrote:
>>>>>>>>
>>>>>>>>> It is possible that the format of your table is not compatible with your version of Hive, so Spark saved it in a way that only Spark can read. When this happens, it prints a very visible warning letting you know.
>>>>>>>>>
>>>>>>>>> We have seen it most frequently when trying to save a Parquet file with a column in date format into a Hive table. In older versions of Hive, the Parquet reader/writer did not support date formats (among a couple of others).
>>>>>>>>>
>>>>>>>>> Nicholas Szandor Hakobian, Ph.D.
>>>>>>>>> Staff Data Scientist
>>>>>>>>> Rally Health
>>>>>>>>> nicholas.hakob...@rallyhealth.com
>>>>>>>>>
>>>>>>>>> On Fri, Feb 9, 2018 at 9:59 AM, Prakash Joshi <prakashcjos...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Ravi,
>>>>>>>>>>
>>>>>>>>>> Can you send the result of:
>>>>>>>>>>
>>>>>>>>>> SHOW CREATE TABLE your_table_name
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Prakash
>>>>>>>>>>
>>>>>>>>>> On Feb 9, 2018 8:20 PM, "☼ R Nair (रविशंकर नायर)" <ravishankar.n...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> All,
>>>>>>>>>>>
>>>>>>>>>>> I have been on this issue for three days continuously, and I am not getting any clue.
>>>>>>>>>>>
>>>>>>>>>>> Environment: Spark 2.2.x, all configurations are correct, and hive-site.xml is in Spark's conf directory.
>>>>>>>>>>>
>>>>>>>>>>> 1) Step 1: I created a data frame DF1 by reading a CSV file.
>>>>>>>>>>>
>>>>>>>>>>> 2) Did manipulations on DF1. The resulting frame is passion_df.
>>>>>>>>>>> 3) passion_df.write.format("orc").saveAsTable("sampledb.passion")
>>>>>>>>>>>
>>>>>>>>>>> 4) The metastore shows the Hive table; when I do "show tables" in Hive, I can see the table name.
>>>>>>>>>>>
>>>>>>>>>>> 5) I can't select it in Hive, though I can select it from Spark as spark.sql("select * from sampledb.passion").
>>>>>>>>>>>
>>>>>>>>>>> What is going on here? Please help. Why am I not seeing the data from the Hive prompt?
>>>>>>>>>>> The "describe formatted" command on the table in Hive shows the data is in the default warehouse location (/user/hive/warehouse), since I set it.
>>>>>>>>>>>
>>>>>>>>>>> I am not getting any definite answer anywhere. Many suggestions and answers are given on Stack Overflow et al., but nothing really works.
>>>>>>>>>>>
>>>>>>>>>>> So I am asking the experts here for some light on this. Thanks.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Ravion
>>>
>>> --
>>> Shmuel Blitz
>>> Big Data Developer
>>> Email: shmuel.bl...@similarweb.com
>>> www.similarweb.com
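A closing note for readers hitting the same symptom: the thread never shows the two warehouse settings side by side, but a mismatch between them would produce exactly this behaviour. A quick sanity check (a sketch; property names as of Spark 2.x and Hive 1.x):

```sql
-- In spark-sql (or via spark.sql("SET spark.sql.warehouse.dir").show(false)):
SET spark.sql.warehouse.dir;
-- In the Hive CLI:
SET hive.metastore.warehouse.dir;
-- If these resolve to different directories, or different filesystems
-- (e.g. file:/ vs hdfs://), saveAsTable() and the Hive CLI will disagree
-- about where a managed table's data lives.
```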