[jira] [Commented] (SQOOP-2411) Sqoop using '--direct' option fails with mysqldump exit code 2 and 3

2017-08-14 Thread Anna Szonyi (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125598#comment-16125598
 ] 

Anna Szonyi commented on SQOOP-2411:


Hi [~sanysand...@gmail.com],

Thanks for following up on this jira!

In general we should only close these types of jiras if we know that we can't 
solve the issue from the Sqoop side, or that it's an expected failure or not a 
problem. However, there may still be open questions around the cause of the 
exception: is the logging sufficient for the end user to tell what the root 
cause was? It's also an open question whether increasing 'net-write-timeout' or 
'net-read-timeout' should solve these failures: is it just a matter of 
increasing it further (and to how much), are we not passing it correctly (a bug 
on our end), or does it not have the desired effect (maybe a doc update)?

In general, if you can reproduce the issue and think it's solvable, this could 
become an improvement jira to improve logging, solve the timeout issues, or 
check whether increasing net-read-timeout helps (or a doc jira about usage).
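
As a rough illustration of the logging question above, the sketch below counts 
the mysqldump exit statuses that appear in a job log and attaches a short 
description to each. The function names are hypothetical, and the status 
descriptions are an assumption based on the EX_* constants in mysqldump's 
client source; they should be verified against the MySQL version in use. This 
is not something Sqoop does today.

```python
import re
from collections import Counter

# ASSUMPTION: meanings taken from the EX_* exit constants in mysqldump's
# client source; verify against the MySQL version you actually run.
MYSQLDUMP_EXIT_CODES = {
    1: "EX_USAGE: bad command-line usage",
    2: "EX_MYSQLERR: error reported by the MySQL server or client library",
    3: "EX_CONSCHECK: consistency check failed",
    4: "EX_EOM: out of memory",
    5: "EX_EOF: error writing the output",
    6: "EX_ILLEGAL_TABLE: illegal table name",
}

# Matches the IOException message shown in the job log of this issue.
STATUS_RE = re.compile(r"mysqldump terminated with status (\d+)")


def summarize_mysqldump_failures(log_text):
    """Count how often each mysqldump exit status appears in a job log."""
    return Counter(int(m.group(1)) for m in STATUS_RE.finditer(log_text))


def describe(status):
    """Return a short description for a status, or a fallback if unknown."""
    return MYSQLDUMP_EXIT_CODES.get(status, "unknown status")


# Two lines copied from the log in the issue description.
sample = (
    "Error: java.io.IOException: mysqldump terminated with status 3\n"
    "Error: java.io.IOException: mysqldump terminated with status 2\n"
)

for status, count in sorted(summarize_mysqldump_failures(sample).items()):
    print(f"status {status} seen {count} time(s): {describe(status)}")
```

A summary like this only tells the user which class of failure occurred; the 
actual root cause would still have to come from mysqldump's own error output.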

Thanks,
Anna

> Sqoop using '--direct' option fails with mysqldump exit code 2 and 3
> 
>
> Key: SQOOP-2411
> URL: https://issues.apache.org/jira/browse/SQOOP-2411
> Project: Sqoop
>  Issue Type: Bug
>  Components: connectors/mysql
>Affects Versions: 1.4.6
> Environment: Amazon EMR
>Reporter: Karthick H
>Assignee: Sandish Kumar HN
>Priority: Critical
>
> I am running Sqoop in AWS EMR. I am trying to copy a table ~10 GB from MySQL 
> into HDFS.
> I get the following exception
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : 
> attempt_1435664372091_0048_m_00_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 3
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:07 INFO mapreduce.Job: Task Id : 
> attempt_1435664372091_0048_m_05_2, Status : FAILED
> Error: java.io.IOException: mysqldump terminated with status 2
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:485)
> at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
> 15/07/06 12:19:08 INFO mapreduce.Job:  map 0% reduce 0%
> 15/07/06 12:19:20 INFO mapreduce.Job:  map 25% reduce 0%
> 15/07/06 12:19:22 INFO mapreduce.Job:  map 38% reduce 0%
> 15/07/06 12:19:23 INFO mapreduce.Job:  map 50% reduce 0%
> 15/07/06 12:19:24 INFO mapreduce.Job:  map 75% reduce 0%
> 15/07/06 12:19:25 INFO mapreduce.Job:  map 100% reduce 0%
> 15/07/06 12:23:11 INFO mapreduce.Job: Job job_1435664372091_0048 failed with 
> state FAILED due to: Task failed task_1435664372091_0048_m_00
> Job failed as tasks failed. failedMaps:1 failedReduces:0
> 15/07/06 12:23:11 INFO mapreduce.Job: Counters: 8
> Job Counters 
> Failed map tasks=28
> Launched map tasks=28
> Other local map tasks=28
> Total time spent by all maps in occupied slots (ms)=34760760
> Total time spent by all reduces in occupied slots (ms)=0
> Total time spent by all map tasks (ms)=5793460
> Total vcore-seconds taken by all map tasks=5793460
> Total megabyte-seconds taken by all map tasks=8342582400
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group FileSystemCounters is 
> deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
> 15/07/06 12:23:11 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 
> 829.8697 seconds (0 bytes/sec)
> 15/07/06 12:23:11 WARN mapreduce.Counters: Group   
> org.apache.hadoop.mapred.Task$Counter is deprecated. Use 
> org.apache.hadoop.mapreduce.TaskCounter 

[jira] [Commented] (SQOOP-3186) Add Sqoop1 (import + --incremental + --check-column) support for functions/expressions

2017-08-14 Thread Eric Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/SQOOP-3186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125498#comment-16125498
 ] 

Eric Lin commented on SQOOP-3186:
-

[~BoglarkaEgyed],

Sorry, I forgot to add this to review. Now done: https://reviews.apache.org/r/61615/

I am still working on adding a new test case. However, HSQL lacks support for 
functions nested inside an aggregate function call, such as 
SUM(COALESCE(col1, 1)), so I can't get my test going yet.

Can you please at least review whether my change so far makes sense?

Thanks

> Add Sqoop1 (import + --incremental + --check-column) support for 
> functions/expressions
> --
>
> Key: SQOOP-3186
> URL: https://issues.apache.org/jira/browse/SQOOP-3186
> Project: Sqoop
>  Issue Type: Improvement
>Reporter: Markus Kemper
>Assignee: Eric Lin
> Attachments: SQOOP-3186.patch
>
>
> Add Sqoop1 (import + --incremental + --check-column) support for 
> functions/expressions, for example:
> *Example*
> {noformat}
> sqoop import \
> --connect $MYCONN --username $MYUSER --password $MYPSWD \
> --table T1 --target-dir /path/directory --merge-key C1 \
> --incremental lastmodified  --last-value '2017-01-01 00:00:00.0' \
> --check-column nvl(C4,to_date('2017-01-01 00:00:00'))
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Review Request 61615: Add Sqoop1 (import + --incremental + --check-column) support for functions/expressions

2017-08-14 Thread Eric Lin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61615/
---

Review request for Sqoop and Boglarka Egyed.


Bugs: SQOOP-3186
https://issues.apache.org/jira/browse/SQOOP-3186


Repository: sqoop-trunk


Description
---

Add Sqoop1 (import + --incremental + --check-column) support for 
functions/expressions, for example:
Example
sqoop import \
--connect $MYCONN --username $MYUSER --password $MYPSWD \
--table T1 --target-dir /path/directory --merge-key C1 \
--incremental lastmodified  --last-value '2017-01-01 00:00:00.0' \
--check-column nvl(C4,to_date('2017-01-01 00:00:00'))
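
To make the intent of the request concrete, here is a minimal sketch of how a 
range predicate could be built when the check column is an expression rather 
than a plain column name. It is not the code from the attached patch: the 
function name, the bound semantics (lower bound inclusive, upper bound 
exclusive), and the upper-bound timestamp are all assumptions made for 
illustration.

```python
def incremental_where(check_expr, last_value, upper_bound):
    """Build a range predicate over an arbitrary SQL expression.

    Hypothetical helper, for illustration only. Both bounds are expected to be
    already-quoted SQL literals.
    """
    # Parenthesize the expression so that any operators inside it cannot
    # change how the comparisons bind.
    expr = f"({check_expr})"
    # ASSUMPTION: lower bound inclusive, upper bound exclusive.
    return f"{expr} >= {last_value} AND {expr} < {upper_bound}"


clause = incremental_where(
    "nvl(C4, to_date('2017-01-01 00:00:00'))",  # expression from the example
    "'2017-01-01 00:00:00.0'",                  # --last-value from the example
    "'2017-08-14 00:00:00.0'",                  # illustrative upper bound
)
print(clause)
```

Wrapping the expression in parentheses is a defensive choice, so that an 
expression such as A + B compares as a unit on both sides of the range.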


Diffs
-

  src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java 891ed4d 
  src/java/org/apache/sqoop/SqoopOptions.java 2eb3d8a 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 1564bdc 
  src/java/org/apache/sqoop/tool/ImportTool.java 807ec8c 


Diff: https://reviews.apache.org/r/61615/diff/1/


Testing
---

Manual testing, still working on adding test cases.

This is the first iteration, so the code still needs to be refined. I will 
update to the latest trunk once suggestions are made.


Thanks,

Eric Lin