Could you try upgrading Impala?

-Abe
On Wed, Jul 1, 2015 at 10:27 PM, Manikandan R <[email protected]> wrote:

> Hello Abe,
>
> Can you please give an update on this? Also let me know if you need any more info.
>
> Thanks,
> Mani
>
> On Tue, Jun 30, 2015 at 11:39 AM, Manikandan R <[email protected]> wrote:
>
>> Impala 1.2.4. We are using an Amazon EMR cluster.
>>
>> Thanks,
>> Mani
>>
>> On Sun, Jun 28, 2015 at 11:37 PM, Abraham Elmahrek <[email protected]> wrote:
>>
>>> Oh, that makes more sense. Seems like a format mismatch. You might have
>>> to upgrade Impala. Mind providing the version of Impala you're using?
>>>
>>> -Abe
>>>
>>> On Fri, Jun 26, 2015 at 12:52 AM, Manikandan R <[email protected]> wrote:
>>>
>>>> The actual errors are:
>>>>
>>>> Query: select * from gwynniebee_bi.mi_test
>>>> ERROR: AnalysisException: Failed to load metadata for table: gwynniebee_bi.mi_test
>>>> CAUSED BY: TableLoadingException: Unrecognized table type for table: gwynniebee_bi.mi_test
>>>>
>>>> On Fri, Jun 26, 2015 at 1:21 PM, Manikandan R <[email protected]> wrote:
>>>>
>>>>> It should be the same, as I have created many tables in Hive before and
>>>>> read them in Impala without any issues.
>>>>>
>>>>> I am running Oozie-based workflows in our production environment that
>>>>> move data from MySQL to HDFS in raw format (via Sqoop Hive imports),
>>>>> then store the same data in Parquet format using the Impala shell;
>>>>> reports run on top of that via Impala queries. This has been working
>>>>> for a few weeks without any issues.
>>>>>
>>>>> Now I am trying to see whether I can import the data from MySQL to
>>>>> Impala (Parquet) directly, to avoid the intermediate step.
>>>>>
>>>>> On Fri, Jun 26, 2015 at 1:02 PM, Abraham Elmahrek <[email protected]> wrote:
>>>>>
>>>>>> Check your config. They should use the same metastore.
>>>>>>
>>>>>> On Fri, Jun 26, 2015 at 12:26 AM, Manikandan R <[email protected]> wrote:
>>>>>>
>>>>>>> Yes, it works. I set HCAT_HOME to HIVE_HOME/hcatalog.
>>>>>>>
>>>>>>> I am able to read the data from Hive, but not from the Impala shell.
>>>>>>> Any workaround?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Mani
>>>>>>>
>>>>>>> On Thu, Jun 25, 2015 at 7:27 PM, Abraham Elmahrek <[email protected]> wrote:
>>>>>>>
>>>>>>>> Make sure HIVE_HOME and HCAT_HOME are set.
>>>>>>>>
>>>>>>>> For the datetime/timestamp issue... this is because Parquet doesn't
>>>>>>>> support timestamp types yet. Avro schemas apparently support them as
>>>>>>>> of 1.8.0: https://issues.apache.org/jira/browse/AVRO-739. Try
>>>>>>>> casting to a numeric or string value first?
>>>>>>>>
>>>>>>>> -Abe
>>>>>>>>
>>>>>>>> On Thu, Jun 25, 2015 at 6:49 AM, Manikandan R <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I am running
>>>>>>>>>
>>>>>>>>> ./sqoop import --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats
>>>>>>>>> --username root --password gwynniebee --table bats_active --hive-import
>>>>>>>>> --hive-database gwynniebee_bi --hive-table test_pq_bats_active
>>>>>>>>> --null-string '\\N' --null-non-string '\\N' --as-parquetfile -m1
>>>>>>>>>
>>>>>>>>> and getting the exception below. I learned from various sources that
>>>>>>>>> $HIVE_HOME has to be set properly to avoid this kind of error. In my
>>>>>>>>> case the corresponding home directory exists, but it still throws the
>>>>>>>>> exception.
>>>>>>>>> 15/06/25 13:24:19 WARN spi.Registration: Not loading URI patterns in org.kitesdk.data.spi.hive.Loader
>>>>>>>>> 15/06/25 13:24:19 ERROR sqoop.Sqoop: Got exception running Sqoop:
>>>>>>>>> org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI:
>>>>>>>>> hive:/gwynniebee_bi/test_pq_bats_active. Check that JARs for hive datasets are on the classpath.
>>>>>>>>> org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI:
>>>>>>>>> hive:/gwynniebee_bi/test_pq_bats_active. Check that JARs for hive datasets are on the classpath.
>>>>>>>>>   at org.kitesdk.data.spi.Registration.lookupDatasetUri(Registration.java:109)
>>>>>>>>>   at org.kitesdk.data.Datasets.create(Datasets.java:228)
>>>>>>>>>   at org.kitesdk.data.Datasets.create(Datasets.java:307)
>>>>>>>>>   at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:107)
>>>>>>>>>   at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:89)
>>>>>>>>>   at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:108)
>>>>>>>>>   at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
>>>>>>>>>   at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
>>>>>>>>>   at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
>>>>>>>>>   at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
>>>>>>>>>   at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
>>>>>>>>>   at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
>>>>>>>>>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>>>>>>>>   at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
>>>>>>>>>   at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
>>>>>>>>>   at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
>>>>>>>>>   at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
>>>>>>>>>
>>>>>>>>> So I tried an alternative solution: creating a Parquet file first,
>>>>>>>>> without any Hive-related options, and then creating a table in Impala
>>>>>>>>> that refers to the same location. Creating the table worked fine, but
>>>>>>>>> querying it throws the error below (I think because of the
>>>>>>>>> date-related columns).
>>>>>>>>>
>>>>>>>>> ERROR: File hdfs://10.183.138.137:9000/data/gwynniebee_bi/test_pq_bats_active/a4a65639-ae38-417e-bbd9-56f4eb76c06b.parquet
>>>>>>>>> has an incompatible type with the table schema for column create_date.
>>>>>>>>> Expected type: BYTE_ARRAY. Actual type: INT64
>>>>>>>>>
>>>>>>>>> Then I tried a table without the datetime columns, and it works fine
>>>>>>>>> in that case.
>>>>>>>>>
>>>>>>>>> I am using Hive 0.13 and sqoop-1.4.6.bin__hadoop-2.0.4-alpha.
>>>>>>>>>
>>>>>>>>> I would prefer the first approach for my requirements. Can anyone
>>>>>>>>> please help me in this regard?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Mani
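
Putting Abe's two suggestions together, here is a minimal, untested sketch of the
fix for the original import. The /usr/lib/hive path is an assumption about the EMR
layout (adjust to your install), and --map-column-java is a standard Sqoop option
used here to force the datetime column to a Java String so the Parquet file carries
BYTE_ARRAY instead of INT64:

    # Both variables must point at a real Hive install so Kite can resolve
    # the hive:/... dataset URI from the stack trace above.
    # /usr/lib/hive is an assumed EMR path; adjust to your layout.
    export HIVE_HOME=/usr/lib/hive
    export HCAT_HOME=$HIVE_HOME/hcatalog

    # Re-run the import, mapping create_date (the column named in the
    # Impala error) to String so it is written as BYTE_ARRAY.
    ./sqoop import \
      --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats \
      --username root --password gwynniebee \
      --table bats_active \
      --hive-import --hive-database gwynniebee_bi \
      --hive-table test_pq_bats_active \
      --null-string '\\N' --null-non-string '\\N' \
      --map-column-java create_date=String \
      --as-parquetfile -m 1

Setting both variables is normally what puts the Hive jars that the
DatasetNotFoundException complains about onto Sqoop's classpath.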

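And a sketch of the two-step fallback Mani describes (plain Parquet import, then an
external Impala table over the same directory). The id column and the rest of the
schema are illustrative, since only create_date is named in the thread:

    # Step 1: import straight to HDFS as Parquet, with no Hive integration;
    # the target directory matches the path in the Impala error above.
    ./sqoop import \
      --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats \
      --username root --password gwynniebee \
      --table bats_active \
      --map-column-java create_date=String \
      --target-dir /data/gwynniebee_bi/test_pq_bats_active \
      --as-parquetfile -m 1

    # Step 2: point an external Impala table at that directory. The id
    # column is hypothetical; create_date is STRING to match the
    # BYTE_ARRAY data written in step 1.
    impala-shell -q "CREATE EXTERNAL TABLE gwynniebee_bi.test_pq_bats_active (
        id BIGINT,
        create_date STRING)
      STORED AS PARQUET
      LOCATION '/data/gwynniebee_bi/test_pq_bats_active';"

With create_date declared STRING on both sides, the "Expected type: BYTE_ARRAY.
Actual type: INT64" mismatch should not reappear.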