Ok, thanks.

On Thu, Jul 2, 2015 at 11:38 AM, Abraham Elmahrek <[email protected]> wrote:
I'd check with the Impala user group! But I think 1.2.4 is an older version. Upgrading might make your headaches go away in general.

-Abe

On Wed, Jul 1, 2015 at 11:04 PM, Manikandan R <[email protected]> wrote:

Ok, Abe. I will try that.

Also, for the past two days impalad has been crashing on one node in particular. Because of this, Oozie workflows are taking a huge amount of time to complete; some are not finishing even after 24 hours. We restart the daemon and it works fine for a while, then it crashes again. It doesn't seem very stable.

I've attached the error report file. Please check.

Thanks,
Mani

On Thu, Jul 2, 2015 at 11:11 AM, Abraham Elmahrek <[email protected]> wrote:

Could you try upgrading Impala?

-Abe

On Wed, Jul 1, 2015 at 10:27 PM, Manikandan R <[email protected]> wrote:

Hello Abe,

Can you please update on this? Also let me know if you need any more info.

Thanks,
Mani

On Tue, Jun 30, 2015 at 11:39 AM, Manikandan R <[email protected]> wrote:

Impala 1.2.4. We are using an Amazon EMR cluster.

Thanks,
Mani

On Sun, Jun 28, 2015 at 11:37 PM, Abraham Elmahrek <[email protected]> wrote:

Oh, that makes more sense. Seems like a format mismatch. You might have to upgrade Impala. Mind providing the version of Impala you're using?

-Abe

On Fri, Jun 26, 2015 at 12:52 AM, Manikandan R <[email protected]> wrote:

The actual errors are:

Query: select * from gwynniebee_bi.mi_test
ERROR: AnalysisException: Failed to load metadata for table: gwynniebee_bi.mi_test
CAUSED BY: TableLoadingException: Unrecognized table type for table: gwynniebee_bi.mi_test

On Fri, Jun 26, 2015 at 1:21 PM, Manikandan R <[email protected]> wrote:

It should be the same, as I have created many tables in Hive before and read them in Impala without any issues.

I am running Oozie-based workflows in the production environment that take data from MySQL to HDFS in raw format (via Sqoop Hive imports), then store the same data in Parquet format using the Impala shell; reports run on top of that using Impala queries. This has been working for a few weeks without any issues.

Now I am trying to see whether I can import the data from MySQL into Impala (Parquet) directly, to avoid the intermediate step.

On Fri, Jun 26, 2015 at 1:02 PM, Abraham Elmahrek <[email protected]> wrote:

Check your config. They should use the same metastore.

On Fri, Jun 26, 2015 at 12:26 AM, Manikandan R <[email protected]> wrote:

Yes, it works. I set HCAT_HOME to HIVE_HOME/hcatalog.

I am able to read the data from Hive, but not from the Impala shell. Any workaround?

Thanks,
Mani

On Thu, Jun 25, 2015 at 7:27 PM, Abraham Elmahrek <[email protected]> wrote:

Make sure HIVE_HOME and HCAT_HOME are set.

For the datetime/timestamp issue... this is because Parquet doesn't support timestamp types yet. Avro schemas support them as of 1.8.0, apparently: https://issues.apache.org/jira/browse/AVRO-739. Try casting to a numeric or string value first?

-Abe
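A minimal sketch of that cast-first idea at the Sqoop end: --map-column-java can force the datetime column to a Java String in the generated schema, so no timestamp type reaches Parquet. The assumption here is that create_date is the only datetime column; repeat the mapping for any others, and verify that the override is actually honored for --as-parquetfile in Sqoop 1.4.6.

    # Sketch only: force the MySQL datetime column to a string in the generated
    # schema so the Parquet file carries strings instead of timestamps.
    ./sqoop import \
      --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats \
      --username root --password gwynniebee \
      --table bats_active \
      --map-column-java create_date=String \
      --hive-import --hive-database gwynniebee_bi --hive-table test_pq_bats_active \
      --null-string '\\N' --null-non-string '\\N' \
      --as-parquetfile -m 1

If the override turns out to be ignored for Parquet imports, a free-form --query import that does CAST(create_date AS CHAR) on the MySQL side gets to the same place.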
On Thu, Jun 25, 2015 at 6:49 AM, Manikandan R <[email protected]> wrote:

Hello,

I am running

./sqoop import --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats --username root --password gwynniebee --table bats_active --hive-import --hive-database gwynniebee_bi --hive-table test_pq_bats_active --null-string '\\N' --null-non-string '\\N' --as-parquetfile -m1

and getting the exception below. I have come to know from various sources that $HIVE_HOME has to be set properly to avoid this kind of error. In my case the corresponding home directory exists, but it still throws the exception:

15/06/25 13:24:19 WARN spi.Registration: Not loading URI patterns in org.kitesdk.data.spi.hive.Loader
15/06/25 13:24:19 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: hive:/gwynniebee_bi/test_pq_bats_active. Check that JARs for hive datasets are on the classpath.
org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: hive:/gwynniebee_bi/test_pq_bats_active. Check that JARs for hive datasets are on the classpath.
    at org.kitesdk.data.spi.Registration.lookupDatasetUri(Registration.java:109)
    at org.kitesdk.data.Datasets.create(Datasets.java:228)
    at org.kitesdk.data.Datasets.create(Datasets.java:307)
    at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:107)
    at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:89)
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:108)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
    at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
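On the "Check that JARs for hive datasets are on the classpath" part, this is usually where the HIVE_HOME/HCAT_HOME advice above applies. A rough sketch of the environment before invoking Sqoop; the paths are assumptions for a typical layout and will differ on an EMR cluster:

    # Sketch only -- adjust the paths to wherever Hive is installed on the cluster.
    export HIVE_HOME=/usr/lib/hive
    export HCAT_HOME=$HIVE_HOME/hcatalog
    # Kite resolves hive:/ dataset URIs only when the Hive/HCatalog jars are
    # visible, so expose them to the job's classpath as well.
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*:$HCAT_HOME/share/hcatalog/*
    ./sqoop import ...   # same import command as above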
So I tried an alternative: creating a Parquet file first, without any Hive-related options, and then creating a table in Impala that points at the same location. That worked, but it throws the issue below (I think it is because of the date-related columns):

ERROR: File hdfs://10.183.138.137:9000/data/gwynniebee_bi/test_pq_bats_active/a4a65639-ae38-417e-bbd9-56f4eb76c06b.parquet has an incompatible type with the table schema for column create_date. Expected type: BYTE_ARRAY. Actual type: INT64

Then I tried the table without the datetime columns, and it works fine in that case.

I am using Hive 0.13 and sqoop-1.4.6.bin__hadoop-2.0.4-alpha.

I would prefer the first approach for my requirements. Can anyone please help me in this regard?

Thanks,
Mani
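On the BYTE_ARRAY vs. INT64 error above: it usually means the Impala table declares create_date as a string while the Parquet file stores it as a 64-bit integer; Kite/Avro commonly writes datetimes as epoch milliseconds, though that is an assumption worth checking against a sample row. One workaround sketch (the *_raw table name and the id column are placeholders) is to declare the column as BIGINT so the table matches the file, and convert at query time:

    # Sketch only: placeholders as noted above; run from a node with impala-shell.
    impala-shell -q "CREATE EXTERNAL TABLE gwynniebee_bi.test_pq_bats_active_raw (
        id BIGINT,
        create_date BIGINT)
      STORED AS PARQUET
      LOCATION '/data/gwynniebee_bi/test_pq_bats_active'"

    # Convert the epoch value back to a readable timestamp when querying.
    impala-shell -q "SELECT id,
        from_unixtime(cast(create_date / 1000 AS BIGINT)) AS create_date
      FROM gwynniebee_bi.test_pq_bats_active_raw LIMIT 5"

If the stored values turn out to be epoch seconds rather than milliseconds, drop the / 1000.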
