Josh Wills created SQOOP-1780:
---------------------------------

             Summary: Avro/Parquet schemas can't handle Sqoop-generated non-alphanumeric column names
                 Key: SQOOP-1780
                 URL: https://issues.apache.org/jira/browse/SQOOP-1780
             Project: Sqoop
          Issue Type: Bug
    Affects Versions: 1.4.5
            Reporter: Josh Wills


I was importing a MySQL table whose column names start with a number (1QP, 
2QP, etc.). It looks like Sqoop prepends an underscore to those names to make 
them compatible with Hive, but Parquet/Avro schemas can't handle a 
non-alphanumeric character in a field name (or at least at the start of one), 
and the import throws the following exception:

java.lang.IllegalStateException: Deprecated: field names are not alphanumeric (plus '_'): sqoop_import_team._1QP, sqoop_import_team._2QP, sqoop_import_team._3QP, sqoop_import_team._4QP
        at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
        at org.kitesdk.data.spi.Compatibility.checkSchema(Compatibility.java:119)
        at org.kitesdk.data.spi.Compatibility.checkDescriptor(Compatibility.java:133)
        at org.kitesdk.data.spi.hive.HiveManagedMetadataProvider.create(HiveManagedMetadataProvider.java:40)
        at org.kitesdk.data.spi.hive.HiveManagedDatasetRepository.create(HiveManagedDatasetRepository.java:76)
        at org.kitesdk.data.Datasets.create(Datasets.java:200)
        at org.kitesdk.data.Datasets.create(Datasets.java:240)
        at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:81)
        at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:70)
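
To illustrate the conflict, here is a minimal sketch of two naming rules that 
pull in opposite directions. AVRO_NAME follows the Avro specification (a name 
starts with a letter or underscore). LEADING_ALNUM is only an assumption, 
inferred from the exception message and from the fact that the underscore-
prefixed names are the ones rejected; it is not copied from Kite's 
Compatibility class. FieldNameCheck and its methods are hypothetical names.

```java
import java.util.regex.Pattern;

public class FieldNameCheck {
    // Avro specification: a name starts with [A-Za-z_], followed by [A-Za-z0-9_].
    static final Pattern AVRO_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // ASSUMPTION: hypothetical stand-in for the check that raised the exception.
    // It requires an alphanumeric first character, so a leading '_' is rejected.
    static final Pattern LEADING_ALNUM = Pattern.compile("[A-Za-z0-9][A-Za-z0-9_]*");

    static boolean isAvroName(String name) {
        return AVRO_NAME.matcher(name).matches();
    }

    static boolean passesAssumedCheck(String name) {
        return LEADING_ALNUM.matcher(name).matches();
    }

    public static void main(String[] args) {
        // "1QP" is the original MySQL column; "_1QP" is the name after the underscore is added.
        for (String name : new String[] {"1QP", "_1QP"}) {
            System.out.println(name
                + " avro=" + isAvroName(name)
                + " assumedCheck=" + passesAssumedCheck(name));
        }
    }
}
```

Under these assumptions, 1QP fails the Avro rule but passes the assumed check, 
and _1QP does the reverse, so neither spelling satisfies both rules.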



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
