Hey man, can you start a separate thread for this? I'd add details like:

- version
- command
- --verbose output

-Abe
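For reference, a minimal sketch of gathering those details; the import line is only a placeholder for whatever command actually failed:

    sqoop version            # prints the Sqoop and Hadoop build versions
    sqoop import --verbose --connect jdbc:sqlserver://... --table ...   # re-run the failing command with --verbose and attach the output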
On Fri, Jul 17, 2015 at 7:35 AM, Anupam sinha <[email protected]> wrote:

I'm facing a similar issue: Sqoop is not working on the access node, and I get the error "SQLServer test failed (1)".

Do I need to change any settings?

On Thu, Jul 2, 2015 at 11:59 AM, Manikandan R <[email protected]> wrote:

Ok, thanks.

On Thu, Jul 2, 2015 at 11:38 AM, Abraham Elmahrek <[email protected]> wrote:

I'd check with the Impala user group! But I think 1.2.4 is an older version; upgrading might make your headaches go away in general.

-Abe

On Wed, Jul 1, 2015 at 11:04 PM, Manikandan R <[email protected]> wrote:

Ok, Abe. I will try that.

Also, for the past two days Impalad has been crashing on one particular node. Because of this, Oozie workflows are taking a huge amount of time to complete; some don't finish even after 24 hours. We restart the daemon and it works fine for a while, then it crashes again. It doesn't seem very stable.

I've attached the error report file. Please check.

Thanks,
Mani

On Thu, Jul 2, 2015 at 11:11 AM, Abraham Elmahrek <[email protected]> wrote:

Could you try upgrading Impala?

-Abe

On Wed, Jul 1, 2015 at 10:27 PM, Manikandan R <[email protected]> wrote:

Hello Abe,

Can you please give an update on this? Also let me know if you need any more info.

Thanks,
Mani

On Tue, Jun 30, 2015 at 11:39 AM, Manikandan R <[email protected]> wrote:

Impala 1.2.4. We are using an Amazon EMR cluster.

Thanks,
Mani

On Sun, Jun 28, 2015 at 11:37 PM, Abraham Elmahrek <[email protected]> wrote:

Oh, that makes more sense. Seems like a format mismatch; you might have to upgrade Impala. Mind providing the version of Impala you're using?

-Abe

On Fri, Jun 26, 2015 at 12:52 AM, Manikandan R <[email protected]> wrote:

The actual errors are:

Query: select * from gwynniebee_bi.mi_test
ERROR: AnalysisException: Failed to load metadata for table: gwynniebee_bi.mi_test
CAUSED BY: TableLoadingException: Unrecognized table type for table: gwynniebee_bi.mi_test

On Fri, Jun 26, 2015 at 1:21 PM, Manikandan R <[email protected]> wrote:

It should be the same, as I have created many tables in Hive before and read them in Impala without any issues.

I am running Oozie-based workflows in our production environment that take data from MySQL to HDFS in raw format (via Sqoop Hive imports), then store the same data in Parquet format using the Impala shell; reports run on top of that as Impala queries. This has been working for a few weeks without any issues.

Now I am trying to see whether I can import the data from MySQL into Impala (Parquet) directly, to avoid the intermediate step.

On Fri, Jun 26, 2015 at 1:02 PM, Abraham Elmahrek <[email protected]> wrote:

Check your config. They should use the same metastore.
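One quick way to verify that, as a sketch; the hive-site.xml path below is typical for a packaged install and may differ on EMR:

    # Hive and Impala should resolve the same metastore URI
    grep -A 1 'hive.metastore.uris' /etc/hive/conf/hive-site.xml

    # after creating a table through Hive, force Impala to reload its catalog
    impala-shell -q 'INVALIDATE METADATA;'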
On Fri, Jun 26, 2015 at 12:26 AM, Manikandan R <[email protected]> wrote:

Yes, that works. I set HCAT_HOME to $HIVE_HOME/hcatalog.

I am able to read the data from Hive, but not from the Impala shell. Any workaround?

Thanks,
Mani

On Thu, Jun 25, 2015 at 7:27 PM, Abraham Elmahrek <[email protected]> wrote:

Make sure HIVE_HOME and HCAT_HOME are set.

For the datetime/timestamp issue... this is because Parquet doesn't support timestamp types yet. Avro schemas apparently support them as of 1.8.0: https://issues.apache.org/jira/browse/AVRO-739. Try casting to a numeric or string value first?

-Abe
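Putting those two suggestions together, something like the following might work; the export paths are guesses for a typical install, and --map-column-java is Sqoop's standard switch for overriding a column's Java type (create_date is the column named in the schema error further down):

    # point Sqoop at Hive/HCatalog -- adjust paths for your cluster
    export HIVE_HOME=/usr/lib/hive
    export HCAT_HOME=$HIVE_HOME/hcatalog

    # re-run the import with the datetime column forced to a Java String,
    # so the generated Parquet schema carries no timestamp type
    sqoop import --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats \
        --username root -P --table bats_active \
        --hive-import --hive-database gwynniebee_bi --hive-table test_pq_bats_active \
        --map-column-java create_date=String \
        --as-parquetfile -m 1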
On Thu, Jun 25, 2015 at 6:49 AM, Manikandan R <[email protected]> wrote:

Hello,

I am running

./sqoop import --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats --username root --password gwynniebee --table bats_active --hive-import --hive-database gwynniebee_bi --hive-table test_pq_bats_active --null-string '\\N' --null-non-string '\\N' --as-parquetfile -m1

and getting the exception below. I have learned from various sources that $HIVE_HOME has to be set properly to avoid this kind of error. In my case the corresponding home directory exists, but it still throws the exception.

15/06/25 13:24:19 WARN spi.Registration: Not loading URI patterns in org.kitesdk.data.spi.hive.Loader
15/06/25 13:24:19 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: hive:/gwynniebee_bi/test_pq_bats_active. Check that JARs for hive datasets are on the classpath.
org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: hive:/gwynniebee_bi/test_pq_bats_active. Check that JARs for hive datasets are on the classpath.
    at org.kitesdk.data.spi.Registration.lookupDatasetUri(Registration.java:109)
    at org.kitesdk.data.Datasets.create(Datasets.java:228)
    at org.kitesdk.data.Datasets.create(Datasets.java:307)
    at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:107)
    at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:89)
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:108)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
    at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

So I tried an alternative solution: creating a Parquet file first, without any Hive-related options, and then creating a table in Impala that refers to the same location. That worked, but it throws the issue below (I think because of the date-related columns):

ERROR: File hdfs://10.183.138.137:9000/data/gwynniebee_bi/test_pq_bats_active/a4a65639-ae38-417e-bbd9-56f4eb76c06b.parquet has an incompatible type with the table schema for column create_date. Expected type: BYTE_ARRAY. Actual type: INT64

Then I tried a table without the datetime columns, and it works fine in that case.

I am using Hive 0.13 and sqoop-1.4.6.bin__hadoop-2.0.4-alpha.

I would prefer the first approach for my requirements. Can anyone please help me in this regard?

Thanks,
Mani
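A rough sketch of that two-step fallback, in case it is useful to others; the column list is illustrative, and the target directory matches the error path in this message:

    # step 1: import straight to HDFS as Parquet, with no Hive options
    sqoop import --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats \
        --username root -P --table bats_active \
        --as-parquetfile -m 1 \
        --target-dir /data/gwynniebee_bi/test_pq_bats_active

    # step 2: point an external Impala table at the imported files
    # (older Impala releases spell the format STORED AS PARQUETFILE)
    impala-shell -q "
    CREATE EXTERNAL TABLE gwynniebee_bi.test_pq_bats_active (
      id BIGINT,           -- illustrative columns; match your MySQL schema
      create_date STRING   -- only works if the column was imported as a string (see the --map-column-java sketch above)
    )
    STORED AS PARQUET
    LOCATION '/data/gwynniebee_bi/test_pq_bats_active';"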
