[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread VISHNU S NAIR (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188653#comment-15188653
 ] 

VISHNU S NAIR commented on SQOOP-2561:
--

Hi Jarek Jarcec Cecho,

SQOOP-2839: I think there won't be a problem for tables with columns 
PROTOCOL_VERSION and "_PROTOCOL_VERSION". In the code we add an extra 
underscore to column names that start with an underscore.

Could you please go through the "toJavaIdentifier()" method in ClassWriter?
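
For illustration, the underscore rule described in this comment can be sketched as below. This is only a minimal sketch of the behaviour as stated here, not the actual ClassWriter code; the class and method names (UnderscorePrefixSketch, prefixLeadingUnderscore) are made up for the example.

```java
public class UnderscorePrefixSketch {

    /**
     * Hypothetical stand-in for the rule described in the comment:
     * a column name that starts with an underscore gets one more
     * underscore prepended; any other name is returned unchanged.
     */
    public static String prefixLeadingUnderscore(String column) {
        return column.startsWith("_") ? "_" + column : column;
    }

    public static void main(String[] args) {
        String[] columns = {"PROTOCOL_VERSION", "_PROTOCOL_VERSION"};
        for (String column : columns) {
            System.out.println(column + " -> " + prefixLeadingUnderscore(column));
        }
        // PROTOCOL_VERSION -> PROTOCOL_VERSION
        // _PROTOCOL_VERSION -> __PROTOCOL_VERSION
    }
}
```

Under this assumption the two generated names stay distinct, which is the point being made about SQOOP-2839.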

> Special Character removal from Column name as avro data results in duplicate 
> column and fails the import
> 
>
> Key: SQOOP-2561
> URL: https://issues.apache.org/jira/browse/SQOOP-2561
> Project: Sqoop
> Issue Type: Bug
> Affects Versions: 1.4.6
> Environment: cdh5.3.2
> Reporter: Suresh
> Assignee: VISHNU S NAIR
> Labels: AVRO, SQOOP
> Fix For: 1.4.7
>
> Attachments: 0001-SQOOP-2561.patch
>
>
> When a special character such as '$' or '#' is present in a column name, 
> Sqoop/Avro removes that character. In some cases this leads to duplicate 
> column names.
> E.g., if the schema has COL$1 and COL1$, the special character is removed from 
> both, producing two columns named COL1, and the Sqoop import job fails when 
> importing as Avro data. The same table can be loaded without the 
> --as-avrodatafile flag.
> A similar issue was raised in 
> https://issues.apache.org/jira/browse/SQOOP-1361, which I suppose is fixed, 
> and that fix is creating this new issue.
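
The collision reported here can be reproduced with a small sketch. It assumes, purely for illustration, that cleaning a name means dropping every character other than letters, digits and underscore; the real Sqoop/Avro cleaning rules may differ, and the names ColumnCollisionSketch, stripSpecial and findCollisions are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColumnCollisionSketch {

    /** Assumed cleaning rule: drop anything that is not a letter, digit or underscore. */
    public static String stripSpecial(String column) {
        return column.replaceAll("[^A-Za-z0-9_]", "");
    }

    /** Groups original column names by cleaned name and keeps only groups that collide. */
    public static Map<String, List<String>> findCollisions(List<String> columns) {
        Map<String, List<String>> byCleanedName = new LinkedHashMap<>();
        for (String column : columns) {
            byCleanedName.computeIfAbsent(stripSpecial(column), k -> new ArrayList<>()).add(column);
        }
        // A group with a single member did not collide with anything.
        byCleanedName.values().removeIf(group -> group.size() < 2);
        return byCleanedName;
    }

    public static void main(String[] args) {
        System.out.println(findCollisions(Arrays.asList("COL$1", "COL1$", "ID")));
        // prints {COL1=[COL$1, COL1$]}
    }
}
```

A check of this kind run before schema generation would at least let the import fail with a clear message naming the colliding columns.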



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread VISHNU S NAIR (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VISHNU S NAIR updated SQOOP-2561:
-
Comment: was deleted

(was: Hi Jarek Jarcec Cecho,
SQOOP-2839: I think there won't be a problem for tables with columns 
PROTOCOL_VERSION and "__PROTOCOL_VERSION", because we add an extra 
underscore to columns that start with "_". So the columns in the class will be 
"__PROTOCOL_VERSION" and "___PROTOCOL_VERSION".
Could you please go through the "toJavaIdentifier()" method in ClassWriter?)






[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread VISHNU S NAIR (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188652#comment-15188652
 ] 

VISHNU S NAIR commented on SQOOP-2561:
--

Hi Jarek Jarcec Cecho,
SQOOP-2839: I think there won't be a problem for tables with columns 
PROTOCOL_VERSION and "__PROTOCOL_VERSION", because we add an extra 
underscore to columns that start with "_". So the columns in the class will be 
"__PROTOCOL_VERSION" and "___PROTOCOL_VERSION".
Could you please go through the "toJavaIdentifier()" method in ClassWriter?






[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread VISHNU S NAIR (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188650#comment-15188650
 ] 

VISHNU S NAIR commented on SQOOP-2561:
--

Hi Jarek Jarcec Cecho,
SQOOP-2839: I think there won't be a problem for tables with columns 
PROTOCOL_VERSION and "__PROTOCOL_VERSION", because we add an extra 
underscore to columns that start with "_". So the columns in the class will be 
"__PROTOCOL_VERSION" and "___PROTOCOL_VERSION".
Could you please go through the "toJavaIdentifier()" method in ClassWriter?






[jira] [Issue Comment Deleted] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread VISHNU S NAIR (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VISHNU S NAIR updated SQOOP-2561:
-
Comment: was deleted

(was: Hi Jarek Jarcec Cecho,
SQOOP-2839: I think there won't be a problem for tables with columns 
PROTOCOL_VERSION and "__PROTOCOL_VERSION", because we add an extra 
underscore to columns that start with "_". So the columns in the class will be 
"__PROTOCOL_VERSION" and "___PROTOCOL_VERSION".
Could you please go through the "toJavaIdentifier()" method in ClassWriter?)






[jira] [Issue Comment Deleted] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread VISHNU S NAIR (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VISHNU S NAIR updated SQOOP-2561:
-
Comment: was deleted

(was: Hi Jarek Jarcec Cecho,
SQOOP-2839: I think there won't be a problem for tables with columns 
PROTOCOL_VERSION and "__PROTOCOL_VERSION", because we add an extra 
underscore to columns that start with "_". So the columns in the class will be 
"__PROTOCOL_VERSION" and "___PROTOCOL_VERSION".
Could you please go through the "toJavaIdentifier()" method in ClassWriter?)






[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread VISHNU S NAIR (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188649#comment-15188649
 ] 

VISHNU S NAIR commented on SQOOP-2561:
--

Hi Jarek Jarcec Cecho,
SQOOP-2839: I think there won't be a problem for tables with columns 
PROTOCOL_VERSION and "__PROTOCOL_VERSION", because we add an extra 
underscore to columns that start with "_". So the columns in the class will be 
"__PROTOCOL_VERSION" and "___PROTOCOL_VERSION".
Could you please go through the "toJavaIdentifier()" method in ClassWriter?






[jira] [Issue Comment Deleted] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread VISHNU S NAIR (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

VISHNU S NAIR updated SQOOP-2561:
-
Comment: was deleted

(was: Hi [~jarcec],

SQOOP-2839: I think there won't be a problem for tables with columns 
PROTOCOL_VERSION and _PROTOCOL_VERSION, because we add an extra 
underscore to columns that start with "_". So the columns in the class will be 
"_PROTOCOL_VERSION" and "__PROTOCOL_VERSION".

Could you please go through the "toJavaIdentifier()" method in ClassWriter?
)






[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread VISHNU S NAIR (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188647#comment-15188647
 ] 

VISHNU S NAIR commented on SQOOP-2561:
--

Hi [~jarcec],

SQOOP-2839: I think there won't be a problem for tables with columns 
PROTOCOL_VERSION and _PROTOCOL_VERSION, because we add an extra 
underscore to columns that start with "_". So the columns in the class will be 
"_PROTOCOL_VERSION" and "__PROTOCOL_VERSION".

Could you please go through the "toJavaIdentifier()" method in ClassWriter?







[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread VISHNU S NAIR (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188616#comment-15188616
 ] 

VISHNU S NAIR commented on SQOOP-2561:
--

Hi [~jarcec],

Thanks for the valuable comment. We can start another follow-up JIRA and work 
on that.








[jira] [Updated] (SQOOP-2826) Sqoop2: Doc: Auto-generate connector pages

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated SQOOP-2826:
--
Attachment: SQOOP-2826.patch

Uploading new version of the patch - still not ready to be reviewed as it's WIP 
that depends on half a dozen patches that are currently under review.

> Sqoop2: Doc: Auto-generate connector pages
> --
>
> Key: SQOOP-2826
> URL: https://issues.apache.org/jira/browse/SQOOP-2826
> Project: Sqoop
> Issue Type: Sub-task
> Reporter: Jarek Jarcec Cecho
> Assignee: Jarek Jarcec Cecho
> Fix For: 1.99.7
>
> Attachments: SQOOP-2826.patch, SQOOP-2826.patch
>
>
> Our current pages describing connectors are heavily outdated. This happened 
> because we've added a bunch of new configuration properties in various patches 
> but forgot to update the docs. While we could force every connector-changing 
> patch to also update the docs, I think that keeping the same 
> information in two places is tedious. A much better option would be to 
> automatically generate connector documentation pages from our code.
> To be very specific, I can see at least two areas where auto-generating 
> content would really help:
> * [Generating input list for 
> connectors|http://sqoop.apache.org/docs/1.99.6/Connectors.html]
> * [Generating command line 
> parameters|http://sqoop.apache.org/docs/1.99.6/CommandLineClient.html]
> I'm sure that there will be others :)
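
To make the idea concrete, here is a rough sketch of generating an input list from code via reflection. Everything in it is an assumption for illustration: the @DocInput annotation, the ExampleLinkConfig class and the bullet-list output format are made up and do not correspond to Sqoop's actual annotations or to the attached patch.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

public class ConnectorDocSketch {

    /** Hypothetical annotation carrying the help text of one input. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface DocInput {
        String help();
    }

    /** Hypothetical configuration class, used only to have something to document. */
    public static class ExampleLinkConfig {
        @DocInput(help = "Host name of the server") public String host;
        @DocInput(help = "Port of the server") public String port;
    }

    /** Renders one bullet per annotated field, sorted by field name for stable output. */
    public static String renderInputs(Class<?> configClass) {
        Map<String, String> inputs = new TreeMap<>();
        for (Field field : configClass.getDeclaredFields()) {
            DocInput doc = field.getAnnotation(DocInput.class);
            if (doc != null) {
                inputs.put(field.getName(), doc.help());
            }
        }
        StringBuilder page = new StringBuilder();
        for (Map.Entry<String, String> input : inputs.entrySet()) {
            page.append("* ").append(input.getKey()).append(": ").append(input.getValue()).append('\n');
        }
        return page.toString();
    }

    public static void main(String[] args) {
        System.out.print(renderInputs(ExampleLinkConfig.class));
        // * host: Host name of the server
        // * port: Port of the server
    }
}
```

Because the text comes from the same classes the connector loads, a new property added in a patch shows up in the generated page without a separate doc change.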





[jira] [Commented] (SQOOP-2495) Sqoop2: Provide simple test that can validate if connector is reasonably formed

2016-03-09 Thread Sqoop QA bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188224#comment-15188224
 ] 

Sqoop QA bot commented on SQOOP-2495:
-

Testing file 
[SQOOP-2495.patch|https://issues.apache.org/jira/secure/attachment/12792353/SQOOP-2495.patch]
 against branch sqoop2 took 0:01:30.294653.

{color:red}Overall:{color} -1 due to an error(s), see details below:

{color:green}SUCCESS:{color} Clean was successful
{color:green}SUCCESS:{color} Patch applied correctly
{color:green}SUCCESS:{color} Patch add/modify test case
{color:green}SUCCESS:{color} License check passed
{color:red}ERROR:{color} failed to build with patch (exit code 1, 
[report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2311/artifact/patch-process/install.txt])

Console output is available 
[here|https://builds.apache.org/job/PreCommit-SQOOP-Build/2311/console].

This message is automatically generated.

> Sqoop2: Provide simple test that can validate if connector is reasonably 
> formed
> ---
>
> Key: SQOOP-2495
> URL: https://issues.apache.org/jira/browse/SQOOP-2495
> Project: Sqoop
> Issue Type: Bug
> Affects Versions: 1.99.6
> Reporter: Jarek Jarcec Cecho
> Assignee: Jarek Jarcec Cecho
> Fix For: 1.99.7
>
> Attachments: SQOOP-2495.patch
>
>
> At an internal hackathon we were hacking on a Sqoop 2 connector with 
> [~singhashish] and we ran into a few troubles that we should address.
> We have a lot of requirements for Sqoop connectors that are only documented 
> but not enforced by the code, for example:
> * Resource bundles need to have names for all properties in configuration 
> objects
> * Configuration objects need to be properly annotated
> If either of those is incorrect, we happily load the connector only to 
> throw some random exceptions at runtime. We should provide a simple test 
> case that all connectors can reuse to validate that a connector is properly 
> formed as Sqoop expects.
> (This still doesn't mean that the connector will work, as we can't guarantee 
> that the extractor/loader is properly implemented. But we can at least help 
> people avoid random exceptions such as those described in SQOOP-2494.)





[jira] [Commented] (SQOOP-2495) Sqoop2: Provide simple test that can validate if connector is reasonably formed

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188124#comment-15188124
 ] 

Jarek Jarcec Cecho commented on SQOOP-2495:
---

The attached patch indeed depends on all the JIRAs that are in "Depends upon" 
section, hence committing it before them doesn't make much sense.






[jira] [Updated] (SQOOP-2495) Sqoop2: Provide simple test that can validate if connector is reasonably formed

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated SQOOP-2495:
--
Attachment: SQOOP-2495.patch






Review Request 44595: SQOOP-2495: Sqoop2: Provide simple test that can validate if connector is reasonably formed

2016-03-09 Thread Jarek Cecho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44595/
---

Review request for Sqoop and Jarek Cecho.


Bugs: SQOOP-2495
https://issues.apache.org/jira/browse/SQOOP-2495


Repository: sqoop-sqoop2


Description
---

At an internal hackathon we were hacking on a Sqoop 2 connector with 
[~singhashish] and we ran into a few troubles that we should address.

We have a lot of requirements for Sqoop connectors that are only documented but 
not enforced by the code, for example:

* Resource bundles need to have names for all properties in configuration 
objects
* Configuration objects need to be properly annotated

If either of those is incorrect, we happily load the connector only to 
throw some random exceptions at runtime. We should provide a simple test case 
that all connectors can reuse to validate that a connector is properly formed as 
Sqoop expects.

(This still doesn't mean that the connector will work, as we can't guarantee 
that the extractor/loader is properly implemented. But we can at least help people 
avoid random exceptions such as those described in SQOOP-2494.)
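
As a rough idea of what such a reusable check could look like, the sketch below verifies that a resource bundle has a label for every field of a configuration class. It is not the content of SqoopConnectorAsserts from this patch; the key convention "<ClassName>.<field>.label", the ExampleLinkConfig class and the missingLabels helper are all assumptions made for the example.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

public class BundleCheckSketch {

    /** Hypothetical configuration class with two properties. */
    public static class ExampleLinkConfig {
        public String host;
        public String port;
    }

    /**
     * Returns the bundle keys that are missing for the fields of configClass,
     * using the assumed key convention "<SimpleClassName>.<field>.label".
     */
    public static List<String> missingLabels(Class<?> configClass, ResourceBundle bundle) {
        List<String> missing = new ArrayList<>();
        for (Field field : configClass.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler- or tool-generated fields
            }
            String key = configClass.getSimpleName() + "." + field.getName() + ".label";
            if (!bundle.containsKey(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // In-memory bundle that deliberately lacks the label for "port".
        ResourceBundle bundle = new ListResourceBundle() {
            @Override
            protected Object[][] getContents() {
                return new Object[][] {{"ExampleLinkConfig.host.label", "Host"}};
            }
        };
        System.out.println(missingLabels(ExampleLinkConfig.class, bundle));
        // prints [ExampleLinkConfig.port.label]
    }
}
```

A connector's unit test could then simply assert that the returned list is empty, so a missing bundle entry fails the build instead of surfacing as an exception at runtime.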


Diffs
-

  connector/connector-ftp/pom.xml a330266
  connector/connector-ftp/src/test/java/org/apache/sqoop/connector/ftp/TestFtpConnector.java PRE-CREATION
  connector/connector-generic-jdbc/pom.xml 8d054c1
  connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestGenericJdbcConnector.java cc1c58f
  connector/connector-hdfs/pom.xml 37cf3fa
  connector/connector-hdfs/src/test/java/org/apache/sqoop/connector/hdfs/TestHdfsConnector.java b41bd5a
  connector/connector-kafka/pom.xml 5f41181
  connector/connector-kafka/src/main/java/org/apache/sqoop/connector/kafka/KafkaConnector.java 2b03fa0
  connector/connector-kafka/src/test/java/org/apache/sqoop/connector/kafka/TestKafkaConnector.java PRE-CREATION
  connector/connector-kite/pom.xml a492c5b
  connector/connector-kite/src/test/java/org/apache/sqoop/connector/kite/TestKiteConnector.java c28f697
  connector/connector-oracle-jdbc/pom.xml 4262cb2
  connector/connector-oracle-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/oracle/TestOracleJdbcConnector.java PRE-CREATION
  connector/connector-sdk-test/pom.xml PRE-CREATION
  connector/connector-sdk-test/src/main/java/org/apache/sqoop/connector/spi/SqoopConnectorAsserts.java PRE-CREATION
  connector/connector-sftp/pom.xml 8db1af5
  connector/connector-sftp/src/test/java/org/apache/sqoop/connector/sftp/TestSftpConnector.java PRE-CREATION
  connector/pom.xml 7340b37
  pom.xml 891b2c9

Diff: https://reviews.apache.org/r/44595/diff/


Testing
---


Thanks,

Jarek Cecho



[jira] [Assigned] (SQOOP-2495) Sqoop2: Provide simple test that can validate if connector is reasonably formed

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho reassigned SQOOP-2495:
-

Assignee: Jarek Jarcec Cecho

I'll take this one up. I've recently started changing the resource bundles and 
I found that we have no easily reusable test that I can run for all the different 
connectors, which is a real bummer. Rather than doing some sort of short-term 
solution, I'll provide a patch for this one.






[jira] [Commented] (SQOOP-2882) Sqoop2: Enrich SFTP Connector resource file

2016-03-09 Thread Sqoop QA bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187827#comment-15187827
 ] 

Sqoop QA bot commented on SQOOP-2882:
-

Testing file 
[SQOOP-2882.patch|https://issues.apache.org/jira/secure/attachment/12792305/SQOOP-2882.patch]
 against branch sqoop2 took 1:38:50.723269.

{color:red}Overall:{color} -1 due to an error(s), see details below:

{color:green}SUCCESS:{color} Clean was successful
{color:green}SUCCESS:{color} Patch applied correctly
{color:red}ERROR:{color} Patch does not add/modify any test case
{color:green}SUCCESS:{color} License check passed
{color:green}SUCCESS:{color} Patch compiled
{color:green}SUCCESS:{color} All unit tests passed (executed 1702 tests)
{color:green}SUCCESS:{color} Test coverage did not decreased 
([report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2309/artifact/patch-process/cobertura_report.txt])
{color:green}SUCCESS:{color} No new findbugs warnings 
([report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2309/artifact/patch-process/findbugs_report.txt])
{color:red}ERROR:{color} Some of integration tests failed 
([report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2309/artifact/patch-process/test_integration.txt],
 executed 82 tests)
* Test {{hive-tests}}
* Test {{org.apache.sqoop.integration.connector.hive.FromRDBMSToKiteHiveTest}}
* Test {{org.apache.sqoop.integration.connector.hdfs.S3Test}}



Console output is available 
[here|https://builds.apache.org/job/PreCommit-SQOOP-Build/2309/console].

This message is automatically generated.

> Sqoop2: Enrich SFTP Connector resource file
> ---
>
> Key: SQOOP-2882
> URL: https://issues.apache.org/jira/browse/SQOOP-2882
> Project: Sqoop
> Issue Type: Sub-task
> Reporter: Jarek Jarcec Cecho
> Assignee: Jarek Jarcec Cecho
> Fix For: 1.99.7
>
> Attachments: SQOOP-2882.patch
>
>
> Please see parent JIRA for details.





[jira] [Commented] (SQOOP-2883) Sqoop2: Update model classes to represent new constant for connector resource bundles

2016-03-09 Thread Sqoop QA bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187825#comment-15187825
 ] 

Sqoop QA bot commented on SQOOP-2883:
-

Testing file 
[SQOOP-2883.patch|https://issues.apache.org/jira/secure/attachment/12792311/SQOOP-2883.patch]
 against branch sqoop2 took 1:13:53.010749.

{color:red}Overall:{color} -1 due to an error(s), see details below:

{color:green}SUCCESS:{color} Clean was successful
{color:green}SUCCESS:{color} Patch applied correctly
{color:green}SUCCESS:{color} Patch add/modify test case
{color:green}SUCCESS:{color} License check passed
{color:green}SUCCESS:{color} Patch compiled
{color:green}SUCCESS:{color} All unit tests passed (executed 1702 tests)
{color:green}SUCCESS:{color} Test coverage did not decrease 
([report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2310/artifact/patch-process/cobertura_report.txt])
{color:orange}WARNING:{color} New findbugs warnings 
([report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2310/artifact/patch-process/findbugs_report.txt])
* Package {{connector/connector-sdk}}: Class 
{{org.apache.sqoop.connector.spi.SqoopConnector}} introduced 2 completely new 
findbugs warnings.


{color:red}ERROR:{color} Some of integration tests failed 
([report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2310/artifact/patch-process/test_integration.txt],
 executed 80 tests)
* Test {{integration-tests}}
* Test {{org.apache.sqoop.integration.connector.hdfs.ParquetTest}}
* Test {{org.apache.sqoop.integration.connector.hdfs.S3Test}}



Console output is available 
[here|https://builds.apache.org/job/PreCommit-SQOOP-Build/2310/console].

This message is automatically generated.

> Sqoop2: Update model classes to represent new constant for connector resource 
> bundles
> -
>
> Key: SQOOP-2883
> URL: https://issues.apache.org/jira/browse/SQOOP-2883
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Jarek Jarcec Cecho
>Assignee: Jarek Jarcec Cecho
> Fix For: 1.99.7
>
> Attachments: SQOOP-2883.patch
>
>
> As part of the parent umbrella JIRA, I've made a few changes to the resource 
> bundles that connectors provide:
> # Added a new example key that we should add to 
> [{{MNamedElement}}|https://github.com/apache/sqoop/blob/sqoop2/common/src/main/java/org/apache/sqoop/model/MNamedElement.java]
> # Added a new key {{connector.name}} with a human-readable connector name. The 
> constant should be referenced somewhere in the code.





[jira] [Resolved] (SQOOP-2548) Sqoop2: RESTiliency: Enforce strict connector names

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho resolved SQOOP-2548.
---
Resolution: Won't Fix

As we've migrated from IDs to names only, this is no longer a relevant concern.

> Sqoop2: RESTiliency: Enforce strict connector names
> ---
>
> Key: SQOOP-2548
> URL: https://issues.apache.org/jira/browse/SQOOP-2548
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Jarek Jarcec Cecho
>Assignee: Jarek Jarcec Cecho
> Fix For: 1.99.7
>
>
> We use the connector name in the URL as is, and we reuse the same URL to 
> retrieve a connector by ID. Although our example connectors are named so as 
> not to cause any trouble, I think we should add code enforcing that 
> connectors are named in a way that won't cause any problems down the road.





[jira] [Created] (SQOOP-2883) Sqoop2: Update model classes to represent new constant for connector resource bundles

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)
Jarek Jarcec Cecho created SQOOP-2883:
-

 Summary: Sqoop2: Update model classes to represent new constant 
for connector resource bundles
 Key: SQOOP-2883
 URL: https://issues.apache.org/jira/browse/SQOOP-2883
 Project: Sqoop
  Issue Type: Sub-task
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
 Fix For: 1.99.7
 Attachments: SQOOP-2883.patch

As part of the parent umbrella JIRA, I've made a few changes to the resource 
bundles that connectors provide:

# Added a new example key that we should add to 
[{{MNamedElement}}|https://github.com/apache/sqoop/blob/sqoop2/common/src/main/java/org/apache/sqoop/model/MNamedElement.java]
# Added a new key {{connector.name}} with a human-readable connector name. The 
constant should be referenced somewhere in the code.
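
A rough sketch of how a connector resource bundle could expose these two keys. The bundle contents, the {{.example}} suffix, and the constant names below are assumptions made for illustration; they are not taken from the attached patch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class ConnectorBundleSketch {

    // Hypothetical constants; the names used in the actual patch may differ.
    public static final String CONNECTOR_NAME_KEY = "connector.name";
    public static final String EXAMPLE_KEY_SUFFIX = ".example";

    // Hypothetical bundle content, inlined so the sketch is self-contained.
    static final String BUNDLE_TEXT =
        "connector.name = SFTP Connector\n"
      + "linkConfig.server.example = sftp.example.com\n";

    // Parses the inlined properties text into a ResourceBundle.
    public static ResourceBundle loadBundle() {
        try {
            return new PropertyResourceBundle(new StringReader(BUNDLE_TEXT));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the example text for an input, or null when the bundle has none.
    public static String exampleFor(ResourceBundle bundle, String inputName) {
        String key = inputName + EXAMPLE_KEY_SUFFIX;
        return bundle.containsKey(key) ? bundle.getString(key) : null;
    }

    public static void main(String[] args) {
        ResourceBundle bundle = loadBundle();
        System.out.println(bundle.getString(CONNECTOR_NAME_KEY));
        System.out.println(exampleFor(bundle, "linkConfig.server"));
    }
}
```

Keeping the key and suffix as named constants means the code that reads the bundle and the code that validates it can share a single definition.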





[jira] [Updated] (SQOOP-2883) Sqoop2: Update model classes to represent new constant for connector resource bundles

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated SQOOP-2883:
--
Attachment: SQOOP-2883.patch

> Sqoop2: Update model classes to represent new constant for connector resource 
> bundles
> -
>
> Key: SQOOP-2883
> URL: https://issues.apache.org/jira/browse/SQOOP-2883
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Jarek Jarcec Cecho
>Assignee: Jarek Jarcec Cecho
> Fix For: 1.99.7
>
> Attachments: SQOOP-2883.patch
>
>
> As part of the parent umbrella JIRA, I've made a few changes to the resource 
> bundles that connectors provide:
> # Added a new example key that we should add to 
> [{{MNamedElement}}|https://github.com/apache/sqoop/blob/sqoop2/common/src/main/java/org/apache/sqoop/model/MNamedElement.java]
> # Added a new key {{connector.name}} with a human-readable connector name. The 
> constant should be referenced somewhere in the code.





Review Request 44586: SQOOP-2883: Sqoop2: Update model classes to represent new constant for connector resource bundles

2016-03-09 Thread Jarek Cecho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44586/
---

Review request for Sqoop and Jarek Cecho.


Bugs: SQOOP-2883
https://issues.apache.org/jira/browse/SQOOP-2883


Repository: sqoop-sqoop2


Description
---

As part of the parent umbrella JIRA, I've made a few changes to the resource 
bundles that connectors provide:

# Added a new example key that we should add to 
[{{MNamedElement}}|https://github.com/apache/sqoop/blob/sqoop2/common/src/main/java/org/apache/sqoop/model/MNamedElement.java]
# Added a new key {{connector.name}} with a human-readable connector name. The 
constant should be referenced somewhere in the code.


Diffs
-

  common/src/main/java/org/apache/sqoop/model/MNamedElement.java 2e4dab3 
  common/src/test/java/org/apache/sqoop/model/TestMNamedElement.java 8556302 
  
  connector/connector-sdk/src/main/java/org/apache/sqoop/connector/spi/SqoopConnector.java 6733906 

Diff: https://reviews.apache.org/r/44586/diff/


Testing
---


Thanks,

Jarek Cecho



[jira] [Updated] (SQOOP-2882) Sqoop2: Enrich SFTP Connector resource file

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)

 [ 
https://issues.apache.org/jira/browse/SQOOP-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated SQOOP-2882:
--
Attachment: SQOOP-2882.patch

> Sqoop2: Enrich SFTP Connector resource file
> ---
>
> Key: SQOOP-2882
> URL: https://issues.apache.org/jira/browse/SQOOP-2882
> Project: Sqoop
>  Issue Type: Sub-task
>Reporter: Jarek Jarcec Cecho
>Assignee: Jarek Jarcec Cecho
> Fix For: 1.99.7
>
> Attachments: SQOOP-2882.patch
>
>
> Please see parent JIRA for details.





[jira] [Created] (SQOOP-2882) Sqoop2: Enrich SFTP Connector resource file

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)
Jarek Jarcec Cecho created SQOOP-2882:
-

 Summary: Sqoop2: Enrich SFTP Connector resource file
 Key: SQOOP-2882
 URL: https://issues.apache.org/jira/browse/SQOOP-2882
 Project: Sqoop
  Issue Type: Sub-task
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
 Fix For: 1.99.7
 Attachments: SQOOP-2882.patch

Please see parent JIRA for details.





Review Request 44585: SQOOP-2882: Sqoop2: Enrich SFTP Connector resource file

2016-03-09 Thread Jarek Cecho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44585/
---

Review request for Sqoop and Jarek Cecho.


Bugs: SQOOP-2882
https://issues.apache.org/jira/browse/SQOOP-2882


Repository: sqoop-sqoop2


Description
---

Please see parent JIRA for details.


Diffs
-

  connector/connector-sftp/src/main/resources/sftp-connector-config.properties c56c8e0 

Diff: https://reviews.apache.org/r/44585/diff/


Testing
---


Thanks,

Jarek Cecho



[jira] [Comment Edited] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187546#comment-15187546
 ] 

Jarek Jarcec Cecho edited comment on SQOOP-2561 at 3/9/16 6:08 PM:
---

Thanks for the reminder [~vishnusn].

I think that this solution won't solve all the cases. For example, if the table 
has columns {{first~column}} and {{first_column}}, then we again create 
duplicates. I was looking into the 
[ClassWriter|https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/orm/ClassWriter.java]
 and it seems to me that it will hit the same problem, so I guess that we can 
leave it be for now and create a follow-up JIRA to solve that problem. What do 
you think [~vishnusn]?
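
To illustrate the concern, here is a minimal sketch of how replacing unsupported characters with an underscore can map two distinct column names onto one identifier. The {{sanitize}} helper is hypothetical and only approximates the idea; it is not the actual {{ClassWriter}} logic nor the logic of the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColumnNameCollision {

    // Hypothetical sanitizer (an assumption, not Sqoop's real code):
    // every character that is not a letter, digit, or '_' becomes '_'.
    public static String sanitize(String column) {
        StringBuilder sb = new StringBuilder();
        for (char c : column.toCharArray()) {
            sb.append(Character.isLetterOrDigit(c) || c == '_' ? c : '_');
        }
        return sb.toString();
    }

    // Groups original column names by their sanitized form and keeps only
    // the groups where two or more columns ended up with the same name.
    public static Map<String, List<String>> findCollisions(List<String> columns) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String column : columns) {
            groups.computeIfAbsent(sanitize(column), k -> new ArrayList<>()).add(column);
        }
        groups.values().removeIf(names -> names.size() < 2);
        return groups;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("first~column", "first_column", "id");
        // Both "first~column" and "first_column" sanitize to "first_column".
        System.out.println(findCollisions(columns));
    }
}
```

A check along these lines could run before code generation so that the import fails early with a clear message, or so that colliding names get a disambiguating suffix; which of the two is preferable is left open here.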


was (Author: jarcec):
Thanks for the reminder [~vishnusn].

I think that this solution won't solve all the cases. For example, if the table 
has columns {{first~column]] and {{first_column}}, then we again create 
duplicates. I was looking into the 
[ClassWriter|https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/orm/ClassWriter.java]
 and it seems to me that it will hit the same problem, so I guess that we can 
leave it be for now and create follow up JIRA to solve that problem. What do 
you think [~vishnusn]?

> Special Character removal from Column name as avro data results in duplicate 
> column and fails the import
> 
>
> Key: SQOOP-2561
> URL: https://issues.apache.org/jira/browse/SQOOP-2561
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: cdh5.3.2
>Reporter: Suresh
>Assignee: VISHNU S NAIR
>  Labels: AVRO, SQOOP
> Fix For: 1.4.7
>
> Attachments: 0001-SQOOP-2561.patch
>
>
> When a special character like '$' or '#' is present in a column name, 
> Sqoop/Avro removes that character. In some cases this leads to duplicate 
> columns.
> E.g., if the schema has COL$1 and COL1$, the '$' is removed from both, 
> producing two columns named COL1, and the Sqoop import job fails when 
> importing as Avro data. The same table can be loaded without the 
> --as-avrodatafile flag.
> A similar issue was raised in 
> https://issues.apache.org/jira/browse/SQOOP-1361, which I suppose is fixed, 
> and that fix is causing this new issue.





[jira] [Commented] (SQOOP-2561) Special Character removal from Column name as avro data results in duplicate column and fails the import

2016-03-09 Thread Jarek Jarcec Cecho (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187546#comment-15187546
 ] 

Jarek Jarcec Cecho commented on SQOOP-2561:
---

Thanks for the reminder [~vishnusn].

I think that this solution won't solve all the cases. For example, if the table 
has columns {{first~column}} and {{first_column}}, then we again create 
duplicates. I was looking into the 
[ClassWriter|https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/orm/ClassWriter.java]
 and it seems to me that it will hit the same problem, so I guess that we can 
leave it be for now and create follow up JIRA to solve that problem. What do 
you think [~vishnusn]?

> Special Character removal from Column name as avro data results in duplicate 
> column and fails the import
> 
>
> Key: SQOOP-2561
> URL: https://issues.apache.org/jira/browse/SQOOP-2561
> Project: Sqoop
>  Issue Type: Bug
>Affects Versions: 1.4.6
> Environment: cdh5.3.2
>Reporter: Suresh
>Assignee: VISHNU S NAIR
>  Labels: AVRO, SQOOP
> Fix For: 1.4.7
>
> Attachments: 0001-SQOOP-2561.patch
>
>
> When a special character like '$' or '#' is present in a column name, 
> Sqoop/Avro removes that character. In some cases this leads to duplicate 
> columns.
> E.g., if the schema has COL$1 and COL1$, the '$' is removed from both, 
> producing two columns named COL1, and the Sqoop import job fails when 
> importing as Avro data. The same table can be loaded without the 
> --as-avrodatafile flag.
> A similar issue was raised in 
> https://issues.apache.org/jira/browse/SQOOP-1361, which I suppose is fixed, 
> and that fix is causing this new issue.


