[jira] Commented: (MAPREDUCE-1224) Calling SELECT t.* from table AS t to get meta information is too expensive for big tables

2010-01-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796710#action_12796710
 ] 

Hudson commented on MAPREDUCE-1224:
---

Integrated in Hadoop-Mapreduce-trunk #196 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/196/])


 Calling SELECT t.* from table AS t to get meta information is too 
 expensive for big tables
 --

                 Key: MAPREDUCE-1224
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1224
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: contrib/sqoop
    Affects Versions: 0.20.1
         Environment: all platforms, generic jdbc driver
            Reporter: Spencer Ho
            Assignee: Spencer Ho
             Fix For: 0.22.0

         Attachments: MAPREDUCE-1224.patch, SqlManager.java


 The SqlManager uses the query SELECT t.* from table AS t to get the table 
 spec. This is too expensive for big tables, and the query is called twice to 
 generate the column names and types.  For tables that are big enough to be 
 map-reduced, the cost is high enough to make sqoop impractical.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1224) Calling SELECT t.* from table AS t to get meta information is too expensive for big tables

2009-11-30 Thread Spencer Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783776#action_12783776
 ] 

Spencer Ho commented on MAPREDUCE-1224:
---

@Aaron,
The particular case that triggered the patch submission involves Microsoft SQL 
Server.  For MySQL, I am using direct mode, which works for most cases.




[jira] Commented: (MAPREDUCE-1224) Calling SELECT t.* from table AS t to get meta information is too expensive for big tables

2009-11-30 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783810#action_12783810
 ] 

Aaron Kimball commented on MAPREDUCE-1224:
--

Good to know that this works with SQL Server as well. Thanks for the patch.




[jira] Commented: (MAPREDUCE-1224) Calling SELECT t.* from table AS t to get meta information is too expensive for big tables

2009-11-27 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783192#action_12783192
 ] 

Aaron Kimball commented on MAPREDUCE-1224:
--

@Jeff Sqoop is already using the ResultSetMetaData associated with the query, 
rather than trying to read the DatabaseMetaData directly. Especially when we 
eventually support arbitrary user-supplied queries, this will be necessary. It 
can also be tricky to set all the parameters for a DatabaseMetaData correctly 
in a generic way. But to get at ResultSetMetaData (which definitely includes 
the proper typing information), a query must be submitted.

@Spencer This is a good catch and improvement! What database are you testing 
against? This patch passes unit tests against HSQLDB, PostgreSQL, and Oracle, 
so +1 from me. 

For PostgreSQL and MySQL, Sqoop uses {{connection.setFetchSize()}} to specify a 
row-buffered (rather than table-buffered) result, so it returns fast. But 
unfortunately, {{setFetchSize()}} is, like everything else in JDBC, poorly 
specified, so there isn't a good way to do this generically. This patch is a 
good way to ensure that the query returns quickly even if the database does 
not respect a row-buffered connection.
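
Editor's illustration of the comment above: a minimal sketch, assuming a plain JDBC connection, of reading column names and types from one ResultSetMetaData in a single pass while passing a fetch-size hint. The class ColumnSpec and its methods are hypothetical and are not taken from SqlManager.java or the attached patch. In the standard JDBC API, setFetchSize() is defined on Statement and ResultSet, and drivers are free to ignore it.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; the real SqlManager may be organized differently. */
final class ColumnSpec {

  /** Collects column name -> java.sql.Types code in a single pass. */
  static Map<String, Integer> fromMetaData(ResultSetMetaData meta)
      throws SQLException {
    Map<String, Integer> columns = new LinkedHashMap<>();
    for (int i = 1; i <= meta.getColumnCount(); i++) { // JDBC indexes from 1
      columns.put(meta.getColumnName(i), meta.getColumnType(i));
    }
    return columns;
  }

  /** Runs the query once and reads both names and types from its metadata. */
  static Map<String, Integer> describe(Connection conn, String query,
      int fetchSizeHint) throws SQLException {
    try (Statement stmt = conn.createStatement()) {
      stmt.setFetchSize(fetchSizeHint); // only a hint; drivers may ignore it
      try (ResultSet rs = stmt.executeQuery(query)) {
        return fromMetaData(rs.getMetaData());
      }
    }
  }
}
```

Because names and types come from the same metadata object, the query only needs to be submitted once. The LinkedHashMap keeps the columns in the order the result set reports them.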





[jira] Commented: (MAPREDUCE-1224) Calling SELECT t.* from table AS t to get meta information is too expensive for big tables

2009-11-20 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780867#action_12780867
 ] 

Jeff Hammerbacher commented on MAPREDUCE-1224:
--

Should we try using actual JDBC metadata calls first? See, e.g., 
http://blog.codebeach.com/2008/12/database-metadata-with-jdbc.html
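
Editor's illustration of the suggestion above: a rough sketch of what a direct metadata call could look like, using the standard DatabaseMetaData.getColumns() method. The class CatalogColumns is hypothetical and is not part of Sqoop. Passing null for catalog and schema is an assumption made for brevity; whether that is appropriate, and how the table name must be cased, varies by database.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Sqoop. */
final class CatalogColumns {

  /** Returns column name -> java.sql.Types code for the matching table. */
  static Map<String, Integer> describe(DatabaseMetaData meta, String table)
      throws SQLException {
    Map<String, Integer> columns = new LinkedHashMap<>();
    // null catalog/schema: do not narrow the search (assumption);
    // "%" matches every column. The table argument is treated as a pattern.
    try (ResultSet rs = meta.getColumns(null, null, table, "%")) {
      while (rs.next()) {
        columns.put(rs.getString("COLUMN_NAME"), rs.getInt("DATA_TYPE"));
      }
    }
    return columns;
  }

  /** Convenience overload that obtains the metadata from a connection. */
  static Map<String, Integer> describe(Connection conn, String table)
      throws SQLException {
    return describe(conn.getMetaData(), table);
  }
}
```

No query against the table itself is submitted here, but the lookup only works for named tables and depends on catalog, schema, and name-pattern arguments that differ between databases.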




[jira] Commented: (MAPREDUCE-1224) Calling SELECT t.* from table AS t to get meta information is too expensive for big tables

2009-11-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780906#action_12780906
 ] 

Hadoop QA commented on MAPREDUCE-1224:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12425689/MAPREDUCE-1224.patch
  against trunk revision 882790.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/256/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/256/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/256/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/256/console

This message is automatically generated.




[jira] Commented: (MAPREDUCE-1224) Calling SELECT t.* from table AS t to get meta information is too expensive for big tables

2009-11-19 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780349#action_12780349
 ] 

Todd Lipcon commented on MAPREDUCE-1224:


Perhaps this could be changed to add WHERE 1 = 0. Any SQL optimizer should 
evaluate this very quickly and return an empty result set, allowing metadata to 
be grabbed without actually doing work. Aaron?
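
Editor's illustration of the suggestion above: a minimal sketch of appending WHERE 1 = 0 to the metadata query so that the result set is empty but still carries its ResultSetMetaData. The class EmptyResultMetadata and its methods are hypothetical; this is not the code from MAPREDUCE-1224.patch. The table name is assumed to be a trusted identifier, since it is concatenated into the SQL text.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not the code from the attached patch. */
final class EmptyResultMetadata {

  /** Builds a query that selects every column but can match no rows. */
  static String buildQuery(String tableName) {
    // tableName is assumed to be trusted; it is not escaped here.
    return "SELECT t.* FROM " + tableName + " AS t WHERE 1 = 0";
  }

  /** Reads the column names from the metadata of the empty result set. */
  static List<String> columnNames(Connection conn, String tableName)
      throws SQLException {
    try (Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(buildQuery(tableName))) {
      ResultSetMetaData meta = rs.getMetaData();
      List<String> names = new ArrayList<>();
      for (int i = 1; i <= meta.getColumnCount(); i++) { // JDBC indexes from 1
        names.add(meta.getColumnName(i));
      }
      return names;
    }
  }
}
```

For example, buildQuery("employees") yields SELECT t.* FROM employees AS t WHERE 1 = 0. How cheaply a given database evaluates the always-false predicate is up to its optimizer.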
