[ 
https://issues.apache.org/jira/browse/TRAFODION-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049046#comment-15049046
 ] 

ASF GitHub Bot commented on TRAFODION-1581:
-------------------------------------------

GitHub user zellerh opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/218

    [TRAFODION-1581] TMUDF for JDBC queries

    This pull request has 3 initial commits:
    
    [TRAFODION-1581] TMUDF for JDBC queries
    [TRAFODION-1672] Failures when running regressions twice
    [TRAFODION-1695] ORDER BY on divisioned table w/o sort
    
    
    The 1581 commit also contains:
    
    [TRAFODION-1582] Optional Drill install for local hadoop
    
    Other small fixes:
    
    - install_local_hadoop now picks ports in the non-ephemeral
      range when doing install_local_hadoop -p fromDisplay.
      The port number range starts at 24000 + 200 * display number.
      Use a display number below 42 to stay out of the ephemeral
      range, and pick a different display number if you run into
      port conflicts.
    
    - The setupdir step (which includes building libhdfs, if
      needed) now logs its output with the suffix ##(setupdir),
      like most other components
    
      core/Makefile
    
    - Fixed a bug causing a core dump when a TMUDF produced no
      output columns, and another bug with a VARCHAR parameter
      at the beginning of a parameter list.
    
      core/optimizer/UdfDllInteraction.cpp
    
    - Fixed a bug in parsing two consecutive patterns in sqlci
      input, like $$a$$$$b$$
    
      core/sqlci/SqlCmd.cpp
    
    - Small fixes to the doxygen documentation: with the new web
      page structure, make the doxygen version match the Trafodion
      version (at least the initial release that applies), and
      fix wiki links to point to the Apache wiki.
    
      core/sql/sqludr/doxygen_tmudr.1.6.config
      core/sql/sqludr/sqludr.cpp
      core/sql/sqludr/sqludr.h
    
    - Make the hive-exec dependency for SQL explicit (the missing
      dependency would sometimes produce a build error, depending
      on the order in which components are built)
    
      core/sql/pom.xml
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zellerh/incubator-trafodion bug/1581

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/218.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #218
    
----
commit 680900cd6e2f5daa959b449d12cda3aa7d8a7853
Author: Hans Zeller <zell...@dev02.trafodion.org>
Date:   2015-09-08T20:48:05Z

    [TRAFODION-1695] Optimize ORDER BY and GROUP BY with salt and divisioning
    
    For queries that access only a single salt bucket or a single division,
    we should be able to produce the rows of the table in order of the
    primary key. This change adds some optimizations to do that.
    
    Predicates on computed columns are used to find queries that access only
    a single salt or division value. The method that matches actual with
    required sort orders recognizes these predicates and also stores them in
    the group attributes, so that later checks again recognize the
    optimization.
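
    As a hedged illustration of the optimization (the "_SALT_" computed
    column name is conventional in Trafodion and is assumed here, as are
    the table and column names):

```sql
-- Hypothetical salted table; all names are illustrative.
create table t (a int not null, b int, primary key (a))
  salt using 8 partitions;

-- Restricting the query to a single salt bucket should allow the rows
-- to be returned in primary-key order without a sort in the plan.
select a, b from t where "_SALT_" = 3 order by a;
```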

commit 849de1a32c34c11e16ace6d9a4d4064148d594be
Author: Hans Zeller <hzel...@apache.org>
Date:   2015-12-09T16:58:39Z

    [TRAFODION-1672] Failures when running regressions twice
    
    Fixing tests executor/TEST009 and executor/TEST130 so they
    clean up correctly at the end and can be run more than once.

commit a34771927c1afd3348fb4609b4c2944f5c47507a
Author: Hans Zeller <hzel...@apache.org>
Date:   2015-12-09T17:17:25Z

    [TRAFODION-1581] TMUDF for JDBC queries
    
    Other JIRAs fixed with this commit:
    
    [TRAFODION-1582] Optional Drill install for local hadoop
    
    This new built-in TMUDF takes arguments that describe a
    JDBC connection and a list of SQL statements and returns
    the result of the one SQL statement in the list that
    produces results:
    
       select ... from udf(JDBC(
          <name of JDBC driver jar>,
          <name of JDBC driver class in the jar>,
          <connection string>,
          <user name>,
          <password>,
          <statement_type>,
          <sql statement 1>
          [ , <sql statements 2 ...n> ] )) ...
    
       The first 7 arguments are required and must be
       string literals that are available at compile
       time.
       Statement type:
          'source': This statement produces a result
                    (only type allowed at this time)
                    (may support "target" to insert
                     into a table via JDBC later)
    
       Note that only one of the SQL statements can be
       a select or other result-producing statement.
       The others can perform setup and cleanup
       operations, if necessary (e.g. create table,
       insert, select, drop table).
    
       For an example, see file
       core/sql/regress/udr/TEST002.
    
    Note that this UDF is still a prototype; it needs more
    testing.
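    
    A concrete invocation might look like the sketch below. The jar name,
    driver class, connection string, and credentials are purely
    hypothetical placeholders (see core/sql/regress/udr/TEST002 for a
    real example):

```sql
select * from udf(JDBC(
   'myjdbc.jar',                        -- hypothetical driver jar
   'org.example.jdbc.ExampleDriver',    -- hypothetical driver class
   'jdbc:example://myhost:12345/mydb',  -- hypothetical connection string
   'trafuser',
   'trafpassword',
   'source',
   'select c1, c2 from remote_table'));
```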
    
    Other small fixes:
    
    - install_local_hadoop now picks ports in the non-ephemeral
      range when doing install_local_hadoop -p fromDisplay.
      The port number range starts at 24000 + 200 * display number.
      Use a display number below 42 to stay out of the ephemeral
      range, and pick a different display number if you run into
      port conflicts.
    
    - The setupdir step (which includes building libhdfs, if
      needed) now logs its output with the suffix ##(setupdir),
      like most other components
    
      core/Makefile
    
    - Fixed a bug causing a core dump when a TMUDF produced no
      output columns, and another bug with a VARCHAR parameter
      at the beginning of a parameter list.
    
      core/optimizer/UdfDllInteraction.cpp
    
    - Fixed a bug in parsing two consecutive patterns in sqlci
      input, like $$a$$$$b$$
    
      core/sqlci/SqlCmd.cpp
    
    - Small fixes to the doxygen documentation: with the new web
      page structure, make the doxygen version match the Trafodion
      version (at least the initial release that applies), and
      fix wiki links to point to the Apache wiki.
    
      core/sql/sqludr/doxygen_tmudr.1.6.config
      core/sql/sqludr/sqludr.cpp
      core/sql/sqludr/sqludr.h
    
    - Make the hive-exec dependency for SQL explicit (the missing
      dependency would sometimes produce a build error, depending
      on the order in which components are built)
    
      core/sql/pom.xml

----


> Add a TMUDF that can return a JDBC result set as table-valued output
> --------------------------------------------------------------------
>
>                 Key: TRAFODION-1581
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1581
>             Project: Apache Trafodion
>          Issue Type: Sub-task
>          Components: sql-general
>    Affects Versions: 1.3-incubating
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>             Fix For: 2.0-incubating
>
>
> One way to read data from other data sources would be a Trafodion TMUDF that 
> takes a connection string, an SQL statement and other necessary info as an 
> input, connects to a JDBC data source, prepares the statement, and returns 
> the result set as a table-valued output. This would enable a basic connector 
> for many data sources, including Spark, Drill and Kafka.
> Specifically, I would like to add a "predefined" TMUDF to Trafodion that 
> takes the following parameters:
> 1. The name of a jar with a JDBC driver.
> 2. A connection string to use
> 3. The class name of the driver
> 4. A user id
> 5. A password
> 6. The type of processing to do (right now only one type is supported)
> 7. Info depending on the type.
> The first type of processing I would like to add is "source", and it does the 
> following: It accepts a list of SQL statements to execute. Only one of these 
> statements can return a result set. The data in the result set will be 
> returned as table-valued output.
> Future processing types could do a parallel select like ODB does or they 
> could insert into a table on the system identified by the JDBC driver info.
> All parameters need to be compile-time constants, so that the UDF can connect 
> to the data source at compile time and prepare the statement. Based on the 
> prepared statement, it will determine number, names and SQL types of the 
> column(s) of the table-valued result.
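
A multi-statement "source" list as described above might look like the
following sketch, in which only the select returns rows and the other
statements perform setup and cleanup (all object names, the jar, the
driver class, the connection string, and the credentials are
hypothetical):

```sql
select * from udf(JDBC(
   'myjdbc.jar', 'org.example.jdbc.ExampleDriver',
   'jdbc:example://myhost:12345/mydb', 'trafuser', 'trafpassword',
   'source',
   'create table session_temp (c1 int)',          -- setup
   'insert into session_temp select c1 from t1',  -- setup
   'select c1 from session_temp',                 -- the one result set
   'drop table session_temp'));                   -- cleanup
```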



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
