[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216521#comment-15216521 ]

Hudson commented on SQOOP-2561:
---

FAILURE: Integrated in Sqoop-hadoop200 #1049 (See [https://builds.apache.org/job/Sqoop-hadoop200/1049/])
SQOOP-2561: Special Character removal from Column name as avro data (jarcec: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git=commit=1dd50cfb2ae327b0df8393dd96d1adb86bb2f65f])
* src/test/com/cloudera/sqoop/TestAvroImport.java
* src/java/org/apache/sqoop/avro/AvroUtil.java

> Special Character removal from Column name as avro data results in duplicate column and fails the import
> ---
>
>                 Key: SQOOP-2561
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2561
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>         Environment: cdh5.3.2
>            Reporter: Suresh
>            Assignee: VISHNU S NAIR
>              Labels: AVRO, SQOOP
>             Fix For: 1.4.7
>
>         Attachments: 0001-SQOOP-2561.patch
>
>
> When a special character such as '$' or '#' is present in a column name, sqoop/avro removes it. In some cases this leads to duplicate columns.
> E.g., if the schema has COL$1 and COL1$, the special character is removed from both, which creates the duplicate column COL1 and fails the Sqoop import job as Avro data. The same table can be loaded without the --as-avrodatafile flag.
> A similar issue was raised in https://issues.apache.org/jira/browse/SQOOP-1361, which I suppose is fixed, and that fix is creating this new issue.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216402#comment-15216402 ]

Hudson commented on SQOOP-2561:
---

FAILURE: Integrated in Sqoop-hadoop100 #1009 (See [https://builds.apache.org/job/Sqoop-hadoop100/1009/])
SQOOP-2561: Special Character removal from Column name as avro data (jarcec: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git=commit=1dd50cfb2ae327b0df8393dd96d1adb86bb2f65f])
* src/java/org/apache/sqoop/avro/AvroUtil.java
* src/test/com/cloudera/sqoop/TestAvroImport.java
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216312#comment-15216312 ]

Hudson commented on SQOOP-2561:
---

FAILURE: Integrated in Sqoop-hadoop23 #1247 (See [https://builds.apache.org/job/Sqoop-hadoop23/1247/])
SQOOP-2561: Special Character removal from Column name as avro data (jarcec: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git=commit=1dd50cfb2ae327b0df8393dd96d1adb86bb2f65f])
* src/java/org/apache/sqoop/avro/AvroUtil.java
* src/test/com/cloudera/sqoop/TestAvroImport.java
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216290#comment-15216290 ]

Hudson commented on SQOOP-2561:
---

FAILURE: Integrated in Sqoop-hadoop20 #1044 (See [https://builds.apache.org/job/Sqoop-hadoop20/1044/])
SQOOP-2561: Special Character removal from Column name as avro data (jarcec: [https://git-wip-us.apache.org/repos/asf?p=sqoop.git=commit=1dd50cfb2ae327b0df8393dd96d1adb86bb2f65f])
* src/java/org/apache/sqoop/avro/AvroUtil.java
* src/test/com/cloudera/sqoop/TestAvroImport.java
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216284#comment-15216284 ]

ASF subversion and git services commented on SQOOP-2561:

Commit 1dd50cfb2ae327b0df8393dd96d1adb86bb2f65f in sqoop's branch refs/heads/trunk from [~jarcec]
[ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=1dd50cf ]

SQOOP-2561: Special Character removal from Column name as avro data results in duplicate column and fails the import
(VISHNU S NAIR via Jarek Jarcec Cecho)
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215423#comment-15215423 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi [~jarcec],
The patch has been corrected; please check it.
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214386#comment-15214386 ]

Jarek Jarcec Cecho commented on SQOOP-2561:
---

Left one comment on the review board [~vishnusn].
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214073#comment-15214073 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi [~jarcec],
Regarding the indentation issue: there is no extra whitespace when we open these files in Notepad, but the diff on the review board shows an extra whitespace. Kindly check.
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203710#comment-15203710 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi [~jarcec], could you please review this issue?
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15189399#comment-15189399 ]

Jarek Jarcec Cecho commented on SQOOP-2561:
---

I'm not concerned about fields that start with an underscore, [~vishnusn]. But I do believe that if a table had two columns, {{first~column}} and {{first_column}}, we would get duplicates. Would you agree?
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188653#comment-15188653 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi Jarek Jarcec Cecho,
SQOOP-2839: I think there won't be a problem for tables with the columns PROTOCOL_VERSION and "_PROTOCOL_VERSION". In the code we add an extra underscore to columns that start with an underscore. Could you please go through the "toJavaIdentifier()" method in ClassWriter?
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188652#comment-15188652 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi Jarek Jarcec Cecho,
SQOOP-2839: I think there won't be a problem for tables with the columns PROTOCOL_VERSION and "__PROTOCOL_VERSION", because we add an extra underscore to columns that start with "". So the columns in the class will be "__PROTOCOL_VERSION" and "___PROTOCOL_VERSION". Could you please go through the "toJavaIdentifier()" method in ClassWriter?
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188650#comment-15188650 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi Jarek Jarcec Cecho,
SQOOP-2839: I think there won't be a problem for tables with the columns PROTOCOL_VERSION and "__PROTOCOL_VERSION", because we add an extra underscore to columns that start with "". So the columns in the class will be "__PROTOCOL_VERSION" and "___PROTOCOL_VERSION". Could you please go through the "toJavaIdentifier()" method in ClassWriter?
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188649#comment-15188649 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi Jarek Jarcec Cecho,
SQOOP-2839: I think there won't be a problem for tables with the columns PROTOCOL_VERSION and "__PROTOCOL_VERSION", because we add an extra underscore to columns that start with "". So the columns in the class will be "__PROTOCOL_VERSION" and "___PROTOCOL_VERSION". Could you please go through the "toJavaIdentifier()" method in ClassWriter?
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188647#comment-15188647 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi [~jarcec],
SQOOP-2839: I think there won't be a problem for tables with the columns PROTOCOL_VERSION and _PROTOCOL_VERSION, because we add an extra underscore to columns that start with "_". So the columns in the class will be "_PROTOCOL_VERSION" and "__PROTOCOL_VERSION". Could you please go through the "toJavaIdentifier()" method in ClassWriter?
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188616#comment-15188616 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi [~jarcec],
Thanks for the valuable comment. We can open a follow-up JIRA and work on that.
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187546#comment-15187546 ]

Jarek Jarcec Cecho commented on SQOOP-2561:
---

Thanks for the reminder [~vishnusn]. I think that this solution won't solve all the cases. For example, if the table has columns {{first~column}} and {{first_column}}, then we again create duplicates. I was looking into the [ClassWriter|https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/orm/ClassWriter.java] and it seems to me that it will hit the same problem, so I guess that we can leave it be for now and create a follow-up JIRA to solve that problem. What do you think [~vishnusn]?
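The remaining case raised in this comment can be illustrated with a small, self-contained sketch. It assumes a hypothetical `sanitize` helper that maps every character outside letters, digits, and underscore to an underscore; this is a stand-in for the replace-with-underscore rule discussed in the thread, not the actual Sqoop code, and the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SanitizedNameCollisions {
    // Hypothetical sanitizer (an assumption, not Sqoop's API): map each
    // character outside [A-Za-z0-9_] to an underscore.
    static String sanitize(String column) {
        return column.replaceAll("[^A-Za-z0-9_]", "_");
    }

    // Group the original column names by their sanitized form and keep
    // only the groups where two or more columns end up with the same name.
    static Map<String, List<String>> collisions(List<String> columns) {
        Map<String, List<String>> bySanitized = new LinkedHashMap<>();
        for (String column : columns) {
            bySanitized.computeIfAbsent(sanitize(column), k -> new ArrayList<>()).add(column);
        }
        bySanitized.values().removeIf(group -> group.size() < 2);
        return bySanitized;
    }

    public static void main(String[] args) {
        // prints {first_column=[first~column, first_column]}
        System.out.println(collisions(Arrays.asList("first~column", "first_column", "id")));
    }
}
```

A check of this kind could run before the schema is built so that the clash is reported with the original column names; whether the follow-up JIRA took that route is not stated here.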
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186608#comment-15186608 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi [~jarcec], could you please review the code?
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175633#comment-15175633 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Hi [~jarcec],
Could you please review the code?

Thanks & regards
Vishnu S Nair.
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175627#comment-15175627 ]

VISHNU S NAIR commented on SQOOP-2561:
--

Special Character removal from Column name as avro data results in duplicate column and fails the import

Solution: Instead of removing the special character, replace it with an underscore.
A corresponding test case is also included in the patch.

Thanks & regards
Vishnu S Nair.
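The proposed solution can be sketched as follows. This is only an illustration under stated assumptions: `replaceSpecialChars` is a hypothetical helper, not the code in AvroUtil.java (the patch itself is not shown in this thread), and "special character" is taken to mean anything outside letters, digits, and underscore.

```java
class ReplaceWithUnderscoreSketch {
    // Hypothetical helper (an assumption, not Sqoop's API): replace each
    // character outside [A-Za-z0-9_] with '_' instead of deleting it.
    static String replaceSpecialChars(String column) {
        return column.replaceAll("[^A-Za-z0-9_]", "_");
    }

    public static void main(String[] args) {
        System.out.println(replaceSpecialChars("COL$1")); // prints COL_1
        System.out.println(replaceSpecialChars("COL1$")); // prints COL1_
    }
}
```

Because the position of the special character is preserved, the two columns from the issue description no longer map to the same name.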
[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import
[ https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14737458#comment-14737458 ]

Suresh commented on SQOOP-2561:
---

This is the error I am getting:

15/09/07 10:52:47 ERROR sqoop.Sqoop: Got exception running Sqoop: org.apache.avro.AvroRuntimeException: Duplicate field CODIGO in record QueryResult: CODIGO type:UNION pos:15 and CODIGO type:UNION pos:13.
org.apache.avro.AvroRuntimeException: Duplicate field CODIGO in record QueryResult: CODIGO type:UNION pos:15 and CODIGO type:UNION pos:13.
    at org.apache.avro.Schema$RecordSchema.setFields(Schema.java:636)
    at org.apache.sqoop.orm.AvroSchemaGenerator.generate(AvroSchemaGenerator.java:91)
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.generateAvroSchema(DataDrivenImportJob.java:132)
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:90)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:262)
    at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:721)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:499)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
    at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:207)
    at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:175)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
    at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:45)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:227)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)

The table has columns named CODIGO$ and COD$IGO.

> Special Character removal from Column name as avro data results in duplicate column and fails the import
> ---
>
>                 Key: SQOOP-2561
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2561
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors
>    Affects Versions: 1.4.5
>         Environment: cdh5.3.2
>            Reporter: Suresh
>              Labels: AVRO, SQOOP
>             Fix For: 1.4.5
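A minimal sketch of why these two column names end up as the same Avro field. It assumes, as the issue description states, that special characters are simply dropped from the column name; `stripSpecialChars` is a hypothetical stand-in for that behavior, not the actual Sqoop method.

```java
class StripCollisionSketch {
    // Hypothetical stand-in for the behavior described in the report:
    // delete every character outside [A-Za-z0-9_].
    static String stripSpecialChars(String column) {
        return column.replaceAll("[^A-Za-z0-9_]", "");
    }

    public static void main(String[] args) {
        String a = stripSpecialChars("CODIGO$");
        String b = stripSpecialChars("COD$IGO");
        // prints CODIGO CODIGO true
        System.out.println(a + " " + b + " " + a.equals(b));
    }
}
```

Both columns reduce to CODIGO, which matches the field name in the "Duplicate field CODIGO" message above.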