Am using sqoop from trunk.

On Fri, Sep 12, 2014 at 12:02 PM, pratik khadloya <[email protected]> wrote:
> Am getting the following error when I am trying to import a table in
> parquet format into hive using hcatalog.
>
> $ bin/sqoop import -jt myjt:xxxx --connect jdbc:mysql://mydbserver.net/mydb \
>     --username myuser --password mypwd \
>     --query "SELECT.... WHERE \$CONDITIONS" --num-mappers 1 \
>     --hcatalog-storage-stanza "STORED AS PARQUET" \
>     --create-hcatalog-table --hcatalog-table abc1234
>
> Please set $HBASE_HOME to the root of your HBase installation.
> Warning: /home/pkhadloya/sqoop-57336d7/bin/../../accumulo does not exist! Accumulo imports will fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> 14/09/12 14:58:31 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-SNAPSHOT
> 14/09/12 14:58:31 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
> 14/09/12 14:58:31 INFO manager.SqlManager: Using default fetchSize of 1000
> 14/09/12 14:58:31 INFO tool.CodeGenTool: Beginning code generation
> 14/09/12 14:58:31 INFO manager.SqlManager: Executing SQL statement: SELECT ...
> 14/09/12 14:58:31 INFO manager.SqlManager: Executing SQL statement: SELECT ...
> 14/09/12 14:58:31 INFO manager.SqlManager: Executing SQL statement: SELECT ...
> 14/09/12 14:58:31 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
> Note: /tmp/sqoop-myuser/compile/a8858915f0a296d14457738acc0f6f77/QueryResult.java uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 14/09/12 14:58:33 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-myuser/compile/a8858915f0a296d14457738acc0f6f77/QueryResult.jar
> 14/09/12 14:58:33 INFO mapreduce.ImportJobBase: Beginning query import.
> 14/09/12 14:58:33 INFO hcat.SqoopHCatUtilities: Configuring HCatalog for import job
> 14/09/12 14:58:33 INFO hcat.SqoopHCatUtilities: Configuring HCatalog specific details for job
> 14/09/12 14:58:33 INFO manager.SqlManager: Executing SQL statement: SELECT...
> 14/09/12 14:58:33 INFO manager.SqlManager: Executing SQL statement: SELECT...
> 14/09/12 14:58:33 INFO hcat.SqoopHCatUtilities: Database column names projected : [sid, pid, pna]
> 14/09/12 14:58:33 INFO hcat.SqoopHCatUtilities: Database column name - info map :
>     sid : [Type : -5,Precision : 20,Scale : 0]
>     pid : [Type : -5,Precision : 20,Scale : 0]
>     pna : [Type : 12,Precision : 255,Scale : 0]
>
> 14/09/12 14:58:33 INFO hcat.SqoopHCatUtilities: Creating HCatalog table default.abc1234 for import
> 14/09/12 14:58:33 INFO hcat.SqoopHCatUtilities: HCatalog Create table statement:
>
> create table `default`.`abc1234` (
>     `sid` bigint,
>     `pid` bigint,
>     `pna` varchar(255))
> STORED AS PARQUET
> 14/09/12 14:58:33 INFO hcat.SqoopHCatUtilities: Executing external HCatalog CLI process with args :-f,/tmp/hcat-script-1410548313797
> 14/09/12 14:58:39 INFO hcat.SqoopHCatUtilities: Launching Job 1 out of 0
> 14/09/12 14:58:39 INFO hcat.SqoopHCatUtilities: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. doubleTypeInfo
> 14/09/12 14:58:39 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: HCat exited with status 1
>     at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.executeExternalHCatProgram(SqoopHCatUtilities.java:1113)
>     at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.launchHCatCli(SqoopHCatUtilities.java:1062)
>     at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.createHCatTable(SqoopHCatUtilities.java:595)
>     at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:318)
>     at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:753)
>     at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
>     at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:252)
>     at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:721)
>     at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:499)
>     at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
>     at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
>     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
>     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
>     at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
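The doubleTypeInfo failure above comes out of the external HCatalog CLI process while it runs the generated CREATE TABLE, and it typically points at a mismatch between the Parquet SerDe on that CLI's classpath and the Hive version; note also that the bare "STORED AS PARQUET" shorthand only exists in Hive 0.13 and later. One way to take the shorthand out of the equation is to spell the storage stanza out. A sketch, assuming the Hive 0.13+ built-in Parquet classes (on older Hive with the parquet-hive bundle the class names are parquet.hive.serde.ParquetHiveSerDe and the Deprecated* input/output formats instead):

  $ bin/sqoop import -jt myjt:xxxx --connect jdbc:mysql://mydbserver.net/mydb \
      --username myuser -P \
      --query "SELECT.... WHERE \$CONDITIONS" --num-mappers 1 \
      --create-hcatalog-table --hcatalog-table abc1234 \
      --hcatalog-storage-stanza "ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'"

As Venkat notes below, Parquet through HCatalog may still break at a later stage even if the DDL succeeds; the direct --as-parquetfile route further down avoids HCatalog entirely.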
>
> On Fri, Sep 12, 2014 at 11:40 AM, pratik khadloya <[email protected]> wrote:
>
>> Thanks Venkat. Would importing the table as an hcat table instead of a
>> hive table automatically put it in hive?
>>
>> ~Pratik
>>
>> On Fri, Sep 12, 2014 at 10:22 AM, Venkat Ranganathan <[email protected]> wrote:
>>
>>> Generally, you should be able to use any storage format that Hive
>>> supports with an HCatalog import or export (of course, some formats,
>>> Parquet for example, may not work if they don't support the Hive SerDe
>>> methods that HCatalog uses; but you can directly import to Parquet
>>> with --as-parquetfile).
>>>
>>> Instead of --hive-import and --hive-table, just use --hcatalog-table
>>> <hivetablename>
>>>
>>> Venkat
>>>
>>> On Fri, Sep 12, 2014 at 10:12 AM, pratik khadloya <[email protected]> wrote:
>>>
>>>> Do we need HCAT_HOME if I am only importing to hive? I don't think I
>>>> have hcatalog installed.
>>>>
>>>> ~Pratik
>>>>
>>>> On Thu, Sep 11, 2014 at 7:16 PM, Xu, Qian A <[email protected]> wrote:
>>>>
>>>>> Yes. Simply replace `--as-avrodatafile` with `--as-parquetfile`.
>>>>>
>>>>> Please make sure the environment variables HIVE_HOME and HCAT_HOME
>>>>> are set correctly.
>>>>>
>>>>> --
>>>>> Qian Xu (Stanley)
>>>>>
>>>>> From: pratik khadloya [mailto:[email protected]]
>>>>> Sent: Friday, September 12, 2014 10:12 AM
>>>>> To: [email protected]
>>>>> Subject: Re: Hive import is not compatible with importing into AVRO format
>>>>>
>>>>> Oh ok, thanks for the information Xu. Can it be invoked using
>>>>> --as-parquetfile with --hive-import?
>>>>>
>>>>> Regards,
>>>>> Pratik
>>>>>
>>>>> On Thu, Sep 11, 2014 at 6:17 PM, Xu, Qian A <[email protected]> wrote:
>>>>>
>>>>> Unfortunately, Avro format is not supported for a Hive import. You
>>>>> can file a JIRA for that. Note that the trunk version of Sqoop1
>>>>> supports Hive import as Parquet.
>>>>>
>>>>> --
>>>>> Qian Xu (Stanley)
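A sketch of Qian's suggestion applied to the original job, with the same placeholders as Pratik's first mail (this assumes a trunk / 1.4.6-SNAPSHOT build; --as-parquetfile does not exist in 1.4.5 or earlier):

  $ bin/sqoop import -jt <jobtracker> --connect jdbc:mysql://<mydbserver>/<mydb> \
      --username <dbuser> -P \
      --query "<my query> WHERE \$CONDITIONS" --num-mappers 1 \
      --hive-import --hive-table mytable --create-hive-table \
      --as-parquetfile

Venkat's HCatalog variant of the same job swaps the --hive-* options for --hcatalog-table (note that --create-hcatalog-table defaults to "stored as rcfile" unless an --hcatalog-storage-stanza says otherwise):

  $ bin/sqoop import -jt <jobtracker> --connect jdbc:mysql://<mydbserver>/<mydb> \
      --username <dbuser> -P \
      --query "<my query> WHERE \$CONDITIONS" --num-mappers 1 \
      --create-hcatalog-table --hcatalog-table mytable

Since HCatalog tables live in the Hive metastore, the result is queryable from Hive either way.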
>>>>>
>>>>> From: [email protected] [mailto:[email protected]]
>>>>> Sent: Friday, September 12, 2014 8:56 AM
>>>>> To: [email protected]
>>>>> Subject: Re: Hive import is not compatible with importing into AVRO format
>>>>>
>>>>> Hey there:
>>>>> Does Hive support the Avro file format? As far as I know, it only
>>>>> supports RCFile, TextFile, and SequenceFile. Hope this helps you.
>>>>>
>>>>> From: pratik khadloya <[email protected]>
>>>>> Date: 2014-09-12 08:26
>>>>> To: [email protected]
>>>>> Subject: Hive import is not compatible with importing into AVRO format
>>>>>
>>>>> I am trying to import data from a free-form mysql query into hive. I
>>>>> need the files to be AVRO data files, but when I pass the
>>>>> --as-avrodatafile option, I get a compatibility error. Is there a way
>>>>> I can tell sqoop to use the avro file format?
>>>>>
>>>>> $ bin/sqoop import -jt <jobtracker> --connect jdbc:mysql://<mydbserver>/<mydb> \
>>>>>     --username <dbuser> --password <dbpwd> \
>>>>>     --target-dir /user/pkhadloya/sqoop/mytable \
>>>>>     --query "<my query> WHERE \$CONDITIONS" --num-mappers 1 \
>>>>>     --hive-import --hive-table mytable --create-hive-table \
>>>>>     --as-avrodatafile
>>>>>
>>>>> ~Pratik
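Another route on released Sqoop, if the files must be Avro: drop --hive-import, land plain Avro data files with --as-avrodatafile, and then map an external Hive table over them by hand. A sketch, assuming Hive's built-in AvroSerDe; the schema path is a placeholder (Sqoop writes a generated .avsc next to its generated .java code, and that file has to be copied to HDFS first):

  $ bin/sqoop import -jt <jobtracker> --connect jdbc:mysql://<mydbserver>/<mydb> \
      --username <dbuser> -P \
      --query "<my query> WHERE \$CONDITIONS" --num-mappers 1 \
      --target-dir /user/pkhadloya/sqoop/mytable \
      --as-avrodatafile

  # Hive 0.14+ understands CREATE TABLE ... STORED AS AVRO; older versions
  # need the explicit SerDe stanza, with the columns taken from the schema.
  # The avro.schema.url value below is a placeholder for the uploaded .avsc.
  $ hive -e "CREATE EXTERNAL TABLE mytable
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    LOCATION '/user/pkhadloya/sqoop/mytable'
    TBLPROPERTIES ('avro.schema.url'='hdfs:///user/pkhadloya/schemas/mytable.avsc')"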
