[jira] [Created] (TRAFODION-1578) Proposal for SPJ management
Kevin Xu created TRAFODION-1578:
---
Summary: Proposal for SPJ management
Key: TRAFODION-1578
URL: https://issues.apache.org/jira/browse/TRAFODION-1578
Project: Apache Trafodion
Issue Type: Improvement
Components: connectivity-dcs
Reporter: Kevin Xu

JAR upload process:
1. Initialize the JAR upload procedure by default.
2. Upload the JAR via Trafci (ADD LIBRARY LIB_NAME JAR_LOCAL_PATH). Both the upload and the library creation happen in this step. Alternatively, the UPLOAD command on Trafci uploads the JAR only, without creating a library.
   Tip: Before putting the JAR into HDFS, check its MD5 first; if the file already exists, only add a record to the metadata table, in case users upload the same JAR many times on the platform.
3. On the server side, the JAR is stored in HDFS. At the same time, the JAR metadata (path in HDFS, MD5 of the file, and so on) is stored in the stored-procedure metadata table.
4. CREATE PROCEDURE works the same as it does today.

JAR execution process:
1. Send a CALL via Trafci/JDBC/ODBC/ADO.NET.
2. DCSMaster assigns a DCSServer to handle the CALL.
3. DCSServer starts a JVM for the user. The user can modify the JVM options, program properties, and Java classpath. At the same time, a monitor class starts in the JVM; it registers a node on Zookeeper for this JVM along with metadata (process id, server info, and so on), and the node is removed when the JVM exits. Users can specify a JVM idle timeout for realtime scenarios such as a Kafka consumer.
4. Useful Trafci commands: list all of a user's JVMs; kill one that is no longer in use; restart JVMs with the latest JARs; and so on.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
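The MD5 dedup check described in step 2 of the upload process could be sketched as below. This is a minimal illustration, not Trafodion code: the class and method names are hypothetical, the in-memory map stands in for the stored-procedure metadata table, and the actual HDFS write is elided.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

public class JarDedup {
    // Compute the MD5 of a jar's bytes as a lowercase hex string.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Simulated metadata table: md5 -> HDFS path of the stored jar.
    static Map<String, String> metadata = new HashMap<>();

    // Returns true if the jar bytes would actually be written to HDFS;
    // false if an identical jar already exists, so only a metadata record
    // is added (the dedup case from the proposal).
    static boolean upload(String hdfsPath, byte[] jarBytes) {
        String sum = md5Hex(jarBytes);
        if (metadata.containsKey(sum)) {
            return false;            // duplicate: record only, skip the HDFS write
        }
        metadata.put(sum, hdfsPath); // new jar: write to HDFS (elided) and record
        return true;
    }

    public static void main(String[] args) {
        byte[] jar = "fake jar contents".getBytes(StandardCharsets.UTF_8);
        System.out.println(upload("/user/trafodion/udr/a.jar", jar)); // first upload
        System.out.println(upload("/user/trafodion/udr/b.jar", jar)); // same bytes again
    }
}
```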
[jira] [Commented] (TRAFODION-1579) Clean up test scripts
[ https://issues.apache.org/jira/browse/TRAFODION-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987785#comment-14987785 ]
ASF GitHub Bot commented on TRAFODION-1579:
---
GitHub user hegdean opened a pull request:
https://github.com/apache/incubator-trafodion/pull/159
[TRAFODION-1579] Clean up test scripts

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/hegdean/incubator-trafodion wrk-brnch
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-trafodion/pull/159.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #159

commit 0ce8b15c094a1e43e13eec67e6d4f35c898533f8
Author: Anuradha Hegde
Date: 2015-11-03T17:53:50Z
Cleanup the scripts to remove HP names

commit 3bc1a7f1c43096c27c0317a26847ecc6b6be8bbe
Author: Anuradha Hegde
Date: 2015-11-03T17:58:22Z
Merge branch 'master' of github.com:apache/incubator-trafodion into wrk-brnch
Conflicts: core/sqf/sql/scripts/install_traf_components
[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987928#comment-14987928 ]
Venkat Muthuswamy commented on TRAFODION-1578:
--
1. I suggest that we keep the interface generic, not just for JARs. It should allow deploying UDF files too.
2. Storing SPJs in user-specific directories can be a problem: you cannot share them with other users. Often, an SPJ developer or admin deploys the SPJs and other users/applications call them, so limiting access to the user who deployed an SPJ will not work. The security concerns can be addressed with SQL privileges on the Library object that encapsulates the jar or UDF file. We had a similar design in the pre-Apache product and had to revert to using a single directory or store because managing it across roles, users, and PUBLIC became too cumbersome.
3. We should keep in mind the ODBC buffer limits and the size of the jar. The client (TrafCI) will have to send the jar contents in chunks, and the SPJ has to assemble them into a single LOB or HDFS file.
4. I have a security concern about the additional commands being proposed ("list all JVMs in user; kill one of them that no long in use; Restart JVMs with latest JARs and so on"). How will security be enforced here? Consider leveraging and extending the Trafodion REST server for this kind of manageability function; these features do not seem to belong in TrafCI.
5. Make sure the proposed interfaces prevent denial-of-service attacks and add the necessary checks to limit file size, etc.
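The chunked transfer raised in point 3 (ODBC buffer limits) could look like the sketch below: the client splits the jar into buffer-sized chunks and the server side reassembles them. The class and method names are illustrative, not actual TrafCI or Trafodion APIs, and the chunk size stands in for the real ODBC buffer limit.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkedTransfer {
    // Client side: split a payload into chunks no larger than maxChunk
    // (a stand-in for the ODBC buffer limit).
    static List<byte[]> split(byte[] payload, int maxChunk) {
        List<byte[]> chunks = new ArrayList<>();
        for (int off = 0; off < payload.length; off += maxChunk) {
            int end = Math.min(off + maxChunk, payload.length);
            chunks.add(Arrays.copyOfRange(payload, off, end));
        }
        return chunks;
    }

    // Server side: reassemble the chunks, in arrival order, into the
    // original bytes before writing them to a single LOB or HDFS file.
    static byte[] assemble(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] c : chunks) out.write(c, 0, c.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] jar = "0123456789".getBytes();
        List<byte[]> chunks = split(jar, 4);
        System.out.println(chunks.size() + " chunks, round-trip ok: "
                + Arrays.equals(assemble(chunks), jar));
    }
}
```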
[jira] [Created] (TRAFODION-1579) Clean up test scripts
Anuradha Hegde created TRAFODION-1579:
-
Summary: Clean up test scripts
Key: TRAFODION-1579
URL: https://issues.apache.org/jira/browse/TRAFODION-1579
Project: Apache Trafodion
Issue Type: Bug
Components: connectivity-mxosrvr
Reporter: Anuradha Hegde

The ODBC test templates have HP names for the datasource and the default service.
[jira] [Commented] (TRAFODION-1579) Clean up test scripts
[ https://issues.apache.org/jira/browse/TRAFODION-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988391#comment-14988391 ]
ASF GitHub Bot commented on TRAFODION-1579:
---
Github user hegdean closed the pull request at:
https://github.com/apache/incubator-trafodion/pull/159
[jira] [Created] (TRAFODION-1580) Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs
Hans Zeller created TRAFODION-1580:
--
Summary: Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs
Key: TRAFODION-1580
URL: https://issues.apache.org/jira/browse/TRAFODION-1580
Project: Apache Trafodion
Issue Type: New Feature
Components: sql-general
Affects Versions: 1.3-incubating
Reporter: Hans Zeller

This is a JIRA for multiple subtasks related to data integration with other Apache projects that can function as data sources or data sinks for Trafodion. This JIRA is specific to work that utilizes TMUDFs (Table-Mapping UDFs).
[jira] [Work started] (TRAFODION-1582) Install Apache drill as an optional add-on in install_local_hadoop
[ https://issues.apache.org/jira/browse/TRAFODION-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on TRAFODION-1582 started by Hans Zeller.
--
> Install Apache drill as an optional add-on in install_local_hadoop
> Key: TRAFODION-1582
> URL: https://issues.apache.org/jira/browse/TRAFODION-1582
> Project: Apache Trafodion
> Issue Type: Sub-task
> Components: sql-general
> Affects Versions: 1.3-incubating
> Reporter: Hans Zeller
> Assignee: Hans Zeller
> Fix For: 2.0-incubating
>
> Add a new script, install_local_drill, to download and install Apache Drill, similar to how we install Hadoop and HBase.
> For Drill this might be relatively easy: just download the tar file and extract it.
[jira] [Work started] (TRAFODION-1581) Add a TMUDF that can return a JDBC result set as table-valued output
[ https://issues.apache.org/jira/browse/TRAFODION-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on TRAFODION-1581 started by Hans Zeller.
[jira] [Work started] (TRAFODION-1580) Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs
[ https://issues.apache.org/jira/browse/TRAFODION-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on TRAFODION-1580 started by Hans Zeller.
[jira] [Created] (TRAFODION-1583) Install Apache Spark as an optional add-on in install_local_hadoop
Hans Zeller created TRAFODION-1583:
--
Summary: Install Apache Spark as an optional add-on in install_local_hadoop
Key: TRAFODION-1583
URL: https://issues.apache.org/jira/browse/TRAFODION-1583
Project: Apache Trafodion
Issue Type: Sub-task
Affects Versions: 1.3-incubating
Reporter: Hans Zeller
Fix For: 2.0-incubating

Optionally install a local instance of Spark, so that we can use it to test integration between Trafodion and Spark.
[jira] [Created] (TRAFODION-1581) Add a TMUDF that can return a JDBC result set as table-valued output
Hans Zeller created TRAFODION-1581:
--
Summary: Add a TMUDF that can return a JDBC result set as table-valued output
Key: TRAFODION-1581
URL: https://issues.apache.org/jira/browse/TRAFODION-1581
Project: Apache Trafodion
Issue Type: Sub-task
Affects Versions: 1.3-incubating
Reporter: Hans Zeller
Assignee: Hans Zeller
Fix For: 2.0-incubating

One way to read data from other data sources would be a Trafodion TMUDF that takes a connection string, an SQL statement, and other necessary info as input, connects to a JDBC data source, prepares the statement, and returns the result set as table-valued output. This would enable a basic connector for many data sources, including Spark, Drill, and Kafka.

Specifically, I would like to add a "predefined" TMUDF to Trafodion that takes the following parameters:
1. The name of a jar with a JDBC driver
2. A connection string to use
3. The class name of the driver
4. A user id
5. A password
6. The type of processing to do (right now only one type is supported)
7. Info depending on the type

The first type of processing I would like to add is "source". It accepts a list of SQL statements to execute; only one of these statements can return a result set, and the data in that result set will be returned as table-valued output. Future processing types could do a parallel select like ODB does, or they could insert into a table on the system identified by the JDBC driver info.

All parameters need to be compile-time constants, so that the UDF can connect to the data source at compile time and prepare the statement. Based on the prepared statement, it will determine the number, names, and SQL types of the columns of the table-valued result.
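The compile-time flow described above (connect, prepare, derive output columns from the result set metadata) could be sketched with plain JDBC as below. This is an illustrative outline, not the actual TMUDF implementation: the class and method names are hypothetical, and the mapping from java.sql.Types codes to SQL type names is an assumption that covers only a few cases.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSetMetaData;
import java.sql.Types;

public class JdbcSourceUdf {
    // Map a java.sql.Types code to a SQL type name for the UDF's output
    // columns. The names on the right are assumptions for this sketch.
    static String typeName(int jdbcType) {
        switch (jdbcType) {
            case Types.INTEGER: return "INTEGER";
            case Types.BIGINT:  return "LARGEINT";
            case Types.VARCHAR: return "VARCHAR";
            case Types.DOUBLE:  return "DOUBLE PRECISION";
            default:            return "UNSUPPORTED(" + jdbcType + ")";
        }
    }

    // Compile-time phase of the UDF (sketch): connect to the remote data
    // source, prepare the statement, and derive the table-valued output
    // columns from the prepared statement's metadata. The url, user,
    // password and sql would come from the UDF's constant parameters.
    // Note: some drivers may return null from getMetaData() before execute.
    static String[] describeOutput(String url, String user, String password,
                                   String sql) throws Exception {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            ResultSetMetaData meta = stmt.getMetaData();
            String[] cols = new String[meta.getColumnCount()];
            for (int i = 1; i <= meta.getColumnCount(); i++) {
                cols[i - 1] = meta.getColumnName(i) + " " + typeName(meta.getColumnType(i));
            }
            return cols;
        }
    }

    public static void main(String[] args) {
        // describeOutput needs a live data source; the type mapping alone:
        System.out.println(typeName(Types.INTEGER));
    }
}
```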
[jira] [Created] (TRAFODION-1584) Install Apache Kafka as an optional add-on in install_local_hadoop
Hans Zeller created TRAFODION-1584:
--
Summary: Install Apache Kafka as an optional add-on in install_local_hadoop
Key: TRAFODION-1584
URL: https://issues.apache.org/jira/browse/TRAFODION-1584
Project: Apache Trafodion
Issue Type: Sub-task
Components: sql-general
Affects Versions: 1.3-incubating
Reporter: Hans Zeller
Fix For: 2.0-incubating

Optionally install Apache Kafka so that we can test integration between Kafka and Trafodion.
[jira] [Work stopped] (TRAFODION-1580) Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs
[ https://issues.apache.org/jira/browse/TRAFODION-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on TRAFODION-1580 stopped by Hans Zeller.
[jira] [Work started] (TRAFODION-1580) Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs
[ https://issues.apache.org/jira/browse/TRAFODION-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on TRAFODION-1580 started by Hans Zeller.
[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987369#comment-14987369 ]
liu ming commented on TRAFODION-1578:
-
I like this idea very much. It lets users deploy an SPJ via a simple trafci command, and since the deployment mechanism is itself an SPJ, it is also very easy to integrate into a GUI manager.

A few points to discuss further:
1. Store the DLL/jar as a LOB or as an HDFS file? An HDFS file seems very good to me.
2. Location of the DLL/jar on the server side. $MY_SQROOT/export/lib/udr/cache is a good place, and as Kevin proposed, there are benefits to putting DLL/jar files in per-user directories: we get finer control, so that user A cannot call user B's SPJ, for example. So maybe we can upload the jar to $MY_SQROOT/export/lib/udr/cache/$USER. If the SPJ/UDR needs extra DLL/jar files, we can simply put them into $MY_SQROOT/export/lib along with the SPJ jar, since $MY_SQROOT/export/lib is already in CLASSPATH and LD_LIBRARY_PATH. Or we can ask the developer of the UDR/SPJ to pack all required DLLs/jars into a single file.
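The per-user directory scheme discussed in the comment above could be sketched as follows. UdrCachePath, userCacheDir, and mayLoad are hypothetical names for illustration only, assuming the $MY_SQROOT/export/lib/udr/cache/$USER layout proposed in the thread.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class UdrCachePath {
    // Build the per-user jar directory proposed in the discussion:
    // $MY_SQROOT/export/lib/udr/cache/$USER
    static Path userCacheDir(String sqRoot, String user) {
        return Paths.get(sqRoot, "export", "lib", "udr", "cache", user);
    }

    // A jar may only be loaded from the calling user's own directory;
    // this is the check that gives "user A cannot call user B's SPJ".
    // normalize() defuses "../" tricks in the supplied jar path.
    static boolean mayLoad(Path jar, String sqRoot, String user) {
        return jar.normalize().startsWith(userCacheDir(sqRoot, user));
    }

    public static void main(String[] args) {
        Path jar = Paths.get("/opt/traf/export/lib/udr/cache/alice/proc.jar");
        System.out.println(mayLoad(jar, "/opt/traf", "alice")); // owner may load
        System.out.println(mayLoad(jar, "/opt/traf", "bob"));   // other users may not
    }
}
```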