Hello Abe,

Can you please give an update on this? Also, let me know if you need any more info.
Thanks,
Mani

On Tue, Jun 30, 2015 at 11:39 AM, Manikandan R <[email protected]> wrote:

> Impala 1.2.4. We are using an Amazon EMR cluster.
>
> Thanks,
> Mani
>
> On Sun, Jun 28, 2015 at 11:37 PM, Abraham Elmahrek <[email protected]> wrote:
>
>> Oh, that makes more sense. Seems like a format mismatch. You might have
>> to upgrade Impala. Mind providing the version of Impala you're using?
>>
>> -Abe
>>
>> On Fri, Jun 26, 2015 at 12:52 AM, Manikandan R <[email protected]> wrote:
>>
>>> The actual errors are:
>>>
>>> Query: select * from gwynniebee_bi.mi_test
>>> ERROR: AnalysisException: Failed to load metadata for table: gwynniebee_bi.mi_test
>>> CAUSED BY: TableLoadingException: Unrecognized table type for table: gwynniebee_bi.mi_test
>>>
>>> On Fri, Jun 26, 2015 at 1:21 PM, Manikandan R <[email protected]> wrote:
>>>
>>>> It should be the same, as I have created many tables in Hive before and
>>>> read them in Impala without any issues.
>>>>
>>>> In our production environment I run Oozie-based workflows that take the
>>>> data from MySQL to HDFS in raw format (via Sqoop Hive imports), store the
>>>> same data again in Parquet format using the Impala shell, and run reports
>>>> on top of it with Impala queries. This has been running for a few weeks
>>>> without any issues.
>>>>
>>>> Now I am trying to see whether I can import the data from MySQL to
>>>> Impala (Parquet) directly, to avoid the intermediate step.
>>>>
>>>> On Fri, Jun 26, 2015 at 1:02 PM, Abraham Elmahrek <[email protected]> wrote:
>>>>
>>>>> Check your config. They should use the same metastore.
>>>>>
>>>>> On Fri, Jun 26, 2015 at 12:26 AM, Manikandan R <[email protected]> wrote:
>>>>>
>>>>>> Yes, it works. I set HCAT_HOME to HIVE_HOME/hcatalog.
>>>>>>
>>>>>> I am able to read the data from Hive, but not from the Impala shell.
>>>>>> Any workaround?
>>>>>>
>>>>>> Thanks,
>>>>>> Mani
>>>>>>
>>>>>> On Thu, Jun 25, 2015 at 7:27 PM, Abraham Elmahrek <[email protected]> wrote:
>>>>>>
>>>>>>> Make sure HIVE_HOME and HCAT_HOME are set.
>>>>>>>
>>>>>>> For the datetime/timestamp issue... this is because Parquet doesn't
>>>>>>> support timestamp types yet. Avro schemas support them as of 1.8.0,
>>>>>>> apparently: https://issues.apache.org/jira/browse/AVRO-739. Try
>>>>>>> casting to a numeric or string value first?
>>>>>>>
>>>>>>> -Abe
>>>>>>>
>>>>>>> On Thu, Jun 25, 2015 at 6:49 AM, Manikandan R <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I am running
>>>>>>>>
>>>>>>>> ./sqoop import --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats
>>>>>>>> --username root --password gwynniebee --table bats_active --hive-import
>>>>>>>> --hive-database gwynniebee_bi --hive-table test_pq_bats_active
>>>>>>>> --null-string '\\N' --null-non-string '\\N' --as-parquetfile -m1
>>>>>>>>
>>>>>>>> and getting the exception below. I gather from various sources that
>>>>>>>> $HIVE_HOME has to be set properly to avoid this kind of error. In my case
>>>>>>>> the corresponding home directory exists, but it still throws the exception:
>>>>>>>>
>>>>>>>> 15/06/25 13:24:19 WARN spi.Registration: Not loading URI patterns in org.kitesdk.data.spi.hive.Loader
>>>>>>>> 15/06/25 13:24:19 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: hive:/gwynniebee_bi/test_pq_bats_active. Check that JARs for hive datasets are on the classpath.
>>>>>>>> org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: hive:/gwynniebee_bi/test_pq_bats_active. Check that JARs for hive datasets are on the classpath.
>>>>>>>>     at org.kitesdk.data.spi.Registration.lookupDatasetUri(Registration.java:109)
>>>>>>>>     at org.kitesdk.data.Datasets.create(Datasets.java:228)
>>>>>>>>     at org.kitesdk.data.Datasets.create(Datasets.java:307)
>>>>>>>>     at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:107)
>>>>>>>>     at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:89)
>>>>>>>>     at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:108)
>>>>>>>>     at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
>>>>>>>>     at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
>>>>>>>>     at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
>>>>>>>>     at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
>>>>>>>>     at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
>>>>>>>>     at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
>>>>>>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>>>>>>>     at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
>>>>>>>>     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
>>>>>>>>     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
>>>>>>>>     at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
>>>>>>>>
>>>>>>>> So I tried an alternative solution: creating a Parquet file first,
>>>>>>>> without any Hive-related options, and then creating a table in Impala
>>>>>>>> that refers to the same location. That worked fine, but it throws the
>>>>>>>> issue below (I think because of the date-related columns):
>>>>>>>>
>>>>>>>> ERROR: File hdfs://10.183.138.137:9000/data/gwynniebee_bi/test_pq_bats_active/a4a65639-ae38-417e-bbd9-56f4eb76c06b.parquet
>>>>>>>> has an incompatible type with the table schema for column create_date.
>>>>>>>> Expected type: BYTE_ARRAY. Actual type: INT64
>>>>>>>>
>>>>>>>> Then I tried a table without the datetime columns, and it works fine
>>>>>>>> in that case.
>>>>>>>>
>>>>>>>> I am using Hive 0.13 and sqoop-1.4.6.bin__hadoop-2.0.4-alpha.
>>>>>>>>
>>>>>>>> I would prefer the first approach for my requirements. Can anyone
>>>>>>>> please help me in this regard?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Mani
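
For anyone hitting the same DatasetNotFoundException, here is a minimal sketch of the Hive-import path with Abe's two suggestions combined: export HIVE_HOME and HCAT_HOME before invoking Sqoop (so the Kite hive:/ dataset URI can be resolved, which is what the "Check that JARs for hive datasets are on the classpath" message points at), and map the datetime columns to strings to sidestep the missing Parquet timestamp support. The install paths are placeholders, and treating create_date as the only datetime column is an assumption; whether Impala 1.2.4 can then read the resulting table is still subject to Abe's note about upgrading.

  # Placeholder paths -- point these at your actual Hive install.
  export HIVE_HOME=/usr/lib/hive
  export HCAT_HOME=$HIVE_HOME/hcatalog        # the setting Mani reported using

  # Same import as in the thread, with the datetime column(s) forced to strings.
  ./sqoop import \
    --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats \
    --username root --password gwynniebee \
    --table bats_active \
    --hive-import --hive-database gwynniebee_bi --hive-table test_pq_bats_active \
    --null-string '\\N' --null-non-string '\\N' \
    --as-parquetfile -m 1 \
    --map-column-java create_date=String    # list every datetime column here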

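And a sketch of the fallback Mani described (writing plain Parquet to HDFS and pointing an external Impala table at it). The "Expected type: BYTE_ARRAY. Actual type: INT64" error suggests create_date was declared as a string-typed column while the file holds 64-bit integers; Sqoop's Avro/Parquet imports typically write datetimes as epoch milliseconds, so declaring the column as BIGINT and converting at query time is one way around it. The impalad host, the _ext table name, and the id column are illustrative assumptions only; the real column list has to match the Parquet file.

  # Assumed host and column list -- adjust to the real cluster and schema.
  impala-shell -i impalad-host -q "
    CREATE EXTERNAL TABLE gwynniebee_bi.test_pq_bats_active_ext (
      id          BIGINT,   -- example column; use the full MySQL column list
      create_date BIGINT    -- BIGINT matches the INT64 (epoch millis) in the file
    )
    STORED AS PARQUETFILE
    LOCATION 'hdfs://10.183.138.137:9000/data/gwynniebee_bi/test_pq_bats_active'"

  # Convert the epoch-millisecond values back to readable timestamps at query time.
  impala-shell -i impalad-host -q "
    SELECT id, from_unixtime(cast(create_date / 1000 AS BIGINT)) AS create_date
    FROM gwynniebee_bi.test_pq_bats_active_ext LIMIT 5"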