[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996155#comment-14996155 ]

Hans Zeller commented on TRAFODION-1578:

The JVMs used for the Trafodion server mainly serve the Trafodion engine, and they need to be able to execute multiple UDRs, since we don't know which UDRs will be executed when we start the master executor:

1. User connects. A JVM is reused or started for the Trafodion engine. We don't know yet which UDRs the user will execute.
2. User calls an SPJ.
   a. At this time the JVM is already running.
3. User calls a TMUDF.
   b. Now we have to reuse the same JVM, if we don't use an MXUDR server.

Are you saying that a user should specify at connect time how much heap they want in the JVM used for their connection? We could probably offer a limited set of options like that. Often it's probably the creator of a UDR, not its user, who wants to specify the heap size and other parameters of the JVM used to execute a UDR. End users may not want to have to worry about that.

Having a separate JVM for the UDRs should help avoid conflicts between Trafodion and UDRs.

Thanks for your patience and your explanations. I hope I now understand your proposal to use sandboxing with a SecurityManager and to set JVM parameters a bit better, and it makes more sense to me. I still have questions, though, about the SecurityManager being too restrictive for many UDRs; see my comment on the thread above.

> Proposal for SPJ management
> ---
>
> Key: TRAFODION-1578
> URL: https://issues.apache.org/jira/browse/TRAFODION-1578
> Project: Apache Trafodion
> Issue Type: Improvement
> Components: connectivity-dcs
> Reporter: Kevin Xu
>
> JAR upload process:
> 1. Initialize the JAR upload procedure by default.
> 2. Upload the JAR through Trafci (add library LIB_NAME JAR_LOCAL_PATH). This uploads the JAR and creates the library in one step. Alternatively, the UPLOAD command on Trafci only uploads the JAR and does not create a library.
>    Tip: Before putting the JAR into HDFS, check its MD5 first. If the file already exists, only add a record to the metadata table, in case users upload the same JAR to the platform many times.
> 3. On the server side, the JAR is stored in HDFS. At the same time, the JAR metadata (path in HDFS, MD5 of the file, and others) is stored in the stored procedure metadata table.
> 4. CREATE PROCEDURE works the same as now.
> JAR execution process:
> 1. Send a CALL through Trafci/JDBC/ODBC/ADO.NET.
> 2. DCSMaster assigns a DCSServer for the CALL.
> 3. DCSServer starts a JVM for the user. The user can modify JVM options, program properties and the Java classpath. At the same time, a monitor class is started in the JVM, which registers a node on ZooKeeper for this JVM along with metadata info (process id, server info and so on); the node is removed when the JVM exits. The customer can specify a JVM idle time for real-time scenarios such as a Kafka consumer.
> 4. Useful commands on Trafci: list all JVMs of a user; kill one that is no longer in use; restart JVMs with the latest JARs; and so on.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
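The MD5 check mentioned in the upload tip of the quoted description can be sketched roughly as follows. This is only an illustration under assumptions: the in-memory dict stands in for HDFS, the list stands in for the stored procedure metadata table, and the names `md5_of_file`, `register_jar` and the storage path layout are hypothetical, not part of Trafodion.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def register_jar(local_path, lib_name, stored_jars, library_records):
    """Store a JAR only if its MD5 is new; always add a metadata record.

    stored_jars: dict mapping MD5 -> storage path (stands in for HDFS)
    library_records: list of (lib_name, md5, storage_path) rows
    Returns True if the JAR content had to be stored, False if it was reused.
    """
    md5 = md5_of_file(local_path)
    uploaded = md5 not in stored_jars
    if uploaded:
        # Hypothetical layout; a real implementation would copy to HDFS here.
        stored_jars[md5] = "/hypothetical/jars/" + md5 + ".jar"
    library_records.append((lib_name, md5, stored_jars[md5]))
    return uploaded
```

Uploading the same bytes under two library names would then store the content once and produce two metadata rows pointing at the same path.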
[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996145#comment-14996145 ]

Hans Zeller commented on TRAFODION-1578:

Thanks for the explanation and the link. Yesterday I didn't realize that the SecurityManager can also be used to sandbox a piece of Java code. I could imagine that this would work for simple functions, especially for scalar UDFs that usually do simple, localized operations. I don't think it would work for TMUDFs. It might work for SPJs that do only Trafodion SQL and other simple operations.

What about Java UDRs that want to access HDFS, HBase or other JDBC drivers? What about UDRs that call other code like Spark, Kafka or MongoDB?
[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996143#comment-14996143 ]

Kevin Xu commented on TRAFODION-1578:

One JVM (process) per Trafodion user by default. Users know how much heap they need better than we do. We can also set limits on each JVM and allow only certain authorized options to be modified.

I'm not saying that it will share a JVM with DCSServer. DCSServer will create a JVM with a customized idle timeout (the user decides how long it stays alive, possibly forever). As I said, the new path will be DCSMaster -> DCSServer -> create a new JVM for the current user if none exists. It has nothing to do with UDR.

Yes, regular users don't have permission to kill DCSServers. They do have permission to start/stop the SPJ JVM (one per user by default).
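The "create a new JVM for the current user if none exists" step, combined with a per-JVM idle timeout, can be sketched as a small registry. This is a minimal in-memory illustration, not DCS code: `UserJvmRegistry` and its methods are hypothetical names, the integer ids stand in for real JVM processes, and the proposal's ZooKeeper registration is not modeled.

```python
import time


class UserJvmRegistry:
    """Tracks one hypothetical worker JVM per user and expires idle ones."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # user -> {"jvm_id", "idle_timeout", "last_used"}
        self._next_id = 1

    def get_or_create(self, user, idle_timeout=300.0):
        """Return the user's JVM id, creating an entry if none exists.

        idle_timeout is in seconds; None means the entry never expires.
        The timeout of an existing entry is left unchanged.
        """
        entry = self._entries.get(user)
        if entry is None:
            entry = {"jvm_id": self._next_id, "idle_timeout": idle_timeout}
            self._next_id += 1
            self._entries[user] = entry
        entry["last_used"] = self._clock()
        return entry["jvm_id"]

    def reap_idle(self):
        """Remove entries idle longer than their timeout; return their users."""
        now = self._clock()
        expired = [
            user
            for user, entry in self._entries.items()
            if entry["idle_timeout"] is not None
            and now - entry["last_used"] > entry["idle_timeout"]
        ]
        for user in expired:
            del self._entries[user]
        return expired
```

Passing the clock in as a parameter keeps the expiry logic testable without sleeping; a real service would call something like `reap_idle` periodically and stop the corresponding processes.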
[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996115#comment-14996115 ]

Kevin Xu commented on TRAFODION-1578:

For authorization, the check should be done in DCSServer (whether the current user has the privilege to make the call) by querying the metadata tables. One JVM (process) per user by default.

For arbitrary code, the JVM SecurityManager should help.

1. FilePermission in the SecurityManager should help.
2. It should be solved if the started JVM only has privileges on a specified folder. Eventually, the JARs have to exist on the node where the JVM runs, and it's fine to copy them from HDFS the first time. It's very good to hear that Venkat will add a GUI function.
3. If the JVM's behavior is under control, it's not necessary to run the process under special user ids. (One user id per connection? If not, connections will impact each other. Otherwise, many user ids will have to be created.)

Refs:
https://en.wikipedia.org/wiki/Java_security#Security_manager
http://docs.oracle.com/javase/7/docs/technotes/guides/security/overview/jsoverview.html
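The "only has privileges on a specified folder" rule from point 2 boils down to a path-containment check. The sketch below is not the Java SecurityManager or FilePermission API; it only illustrates, in Python, the kind of rule such a grant expresses. The function name `is_within` and the example directories are assumptions.

```python
from pathlib import Path


def is_within(allowed_root, candidate):
    """Return True if candidate resolves to allowed_root or a path below it.

    Resolving first normalizes '..' segments and symlinks, so a path such as
    '/tmp/udr/../etc/passwd' is not mistaken for something under '/tmp/udr'.
    """
    root = Path(allowed_root).resolve()
    target = Path(candidate).resolve()
    return target == root or root in target.parents
```

For example, `is_within("/tmp/udr", "/tmp/udr/lib/a.jar")` is accepted, while a sibling directory such as `/tmp/udrx` is rejected because the comparison works on whole path components rather than string prefixes.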
[jira] [Comment Edited] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995948#comment-14995948 ]

Kevin Xu edited comment on TRAFODION-1578 at 11/9/15 5:04 AM:

For multi-threaded DCS, the path is DCSMaster -> DCSServer -> jdbcT2. For SPJ, it's DCSMaster -> DCSServer -> SQ Engine -> UDR -> jdbcT2 (Java -> C++ -> Java; why not Java -> Java?).

It's not necessary to create a new JVM for each connection. With a single JVM, however, the JARs would have to be loaded dynamically, as UDR does today, and that raises security issues. So the proposal is to start a new JVM and manage its security with the Java SecurityManager. CONTAINS SQL is trying to reuse the JVM, right? As I mentioned, there is a customized idle timeout (including never stopping) for standby.

Yes, users can modify JVM options directly, but there will be limits on customized options such as the maximum heap size of the JVM and the number of connections for one user. They are also allowed to start/restart it. I'd like to experiment with a new command, 'agent'.

was (Author: kevinxu021):
For multi-threaded DCS, the path is DCSMaster -> DCSServer -> jdbcT2. For SPJ, it's DCSMaster -> DCSServer -> UDR -> jdbcT2 (Java -> C++ -> Java; why not Java -> Java?).

It's not necessary to create a new JVM for each connection. With a single JVM, however, the JARs would have to be loaded dynamically, as UDR does today. CONTAINS SQL is trying to reuse the JVM, right? As I mentioned, there is a customized idle timeout (including never stopping) for standby.

Yes, users can modify JVM options directly, but there will be limits on customized options such as the maximum heap size of the JVM and the number of connections for one user. They are also allowed to start/restart it. I'd like to experiment with a new command, 'agent'.
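The idea that users may set JVM options, but only within limits such as a maximum heap size, can be sketched as an allow-list filter. This is an illustration under assumptions: the function names, the prefix-based allow-list and the cap value are hypothetical; only the `-Xmx<size>[k|m|g]` and `-D` option forms are standard JVM syntax.

```python
import re

_SIZE_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_heap_size(option):
    """Parse an '-Xmx512m' style option into bytes; None if it is not -Xmx."""
    match = re.fullmatch(r"-Xmx(\d+)([kKmMgG]?)", option)
    if match is None:
        return None
    number, unit = match.groups()
    return int(number) * _SIZE_UNITS.get(unit.lower(), 1)


def filter_jvm_options(requested, allowed_prefixes, max_heap_bytes):
    """Split requested options into (accepted, rejected).

    -Xmx is accepted only up to max_heap_bytes; any other option must start
    with one of allowed_prefixes (a hypothetical administrator allow-list).
    """
    accepted, rejected = [], []
    for opt in requested:
        heap = parse_heap_size(opt)
        if heap is not None:
            (accepted if heap <= max_heap_bytes else rejected).append(opt)
        elif any(opt.startswith(prefix) for prefix in allowed_prefixes):
            accepted.append(opt)
        else:
            rejected.append(opt)
    return accepted, rejected
```

With a 1 GB cap and only `-D` properties allowed, a request for `-Xmx512m` and `-Dapp.mode=test` would pass, while `-Xmx8g` or a `-javaagent:` option would be rejected.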
[jira] [Commented] (TRAFODION-1591) Need a better way to pick ports for install_local_hadoop
[ https://issues.apache.org/jira/browse/TRAFODION-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996035#comment-14996035 ]

ASF GitHub Bot commented on TRAFODION-1591:

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-trafodion/pull/162

> Need a better way to pick ports for install_local_hadoop
> ---
>
> Key: TRAFODION-1591
> URL: https://issues.apache.org/jira/browse/TRAFODION-1591
> Project: Apache Trafodion
> Issue Type: Improvement
> Components: sql-general
> Affects Versions: 2.0-incubating
> Reporter: David Wayne Birdsall
> Assignee: David Wayne Birdsall
> Priority: Minor
[jira] [Commented] (TRAFODION-1604) LOB: Lob handle with a schema name longer than 8 characters crashes sqlci at malloc()
[ https://issues.apache.org/jira/browse/TRAFODION-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996016#comment-14996016 ]

ASF GitHub Bot commented on TRAFODION-1604:

GitHub user sandhyasun opened a pull request:

https://github.com/apache/incubator-trafodion/pull/165

Resolving the following JIRAs related to the LOB feature:

TRAFODION-1604 - Fixed one place in CharType.h where 100 was the max limit for the LOB handle length.
TRAFODION-1596 - Added checks in the ALTER code to prevent adding LOB columns.
TRAFODION-1598 - Added several syntax fixes in the parser and ExpLOBaccess.cpp to address these problems.
TRAFODION-1599 - Added checks in the binder to prevent sample columns from being LOB columns.
TRAFODION-1602 - Added checks in the DDL layer to prevent LOB columns in unique constraints or STORE BY.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sandhyasun/incubator-trafodion lob_work_files2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/165.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #165

commit 825216d34698fd30b209e19ae0520ebd6f765fc0
Author: Sandhya Sundaresan
Date: 2015-11-09T04:36:20Z

Resolving the following JIRAs:
TRAFODION-1604 - Fixed one place in CharType.h where 100 was the max limit for the LOB handle length.
TRAFODION-1596 - Added checks in the ALTER code to prevent adding LOB columns.
TRAFODION-1598 - Added several syntax fixes in the parser and ExpLOBaccess.cpp to address these problems.
TRAFODION-1599 - Added checks in the binder to prevent sample columns from being LOB columns.
TRAFODION-1602 - Added checks in the DDL layer to prevent LOB columns in unique constraints or STORE BY.

> LOB: Lob handle with a schema name longer than 8 characters crashes sqlci at malloc()
> ---
>
> Key: TRAFODION-1604
> URL: https://issues.apache.org/jira/browse/TRAFODION-1604
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-exe
> Affects Versions: 1.2-incubating
> Reporter: Sandhya Sundaresan
> Assignee: Sandhya Sundaresan
>
> As shown in the following example, a schema name with 8 characters, 'a1234567', generates a proper LOB handle. But once the schema name is longer than 8 characters, such as 'a123456789', the schema name in the LOB handle is truncated without the closing double quote, and selecting the LOB handle crashes sqlci at malloc():
> LOBH0200010432353574095397060919722103915125615358218212312231160552060024"TRAFODION"."A12345678
> >>create schema a123456789;
> --- SQL operation complete.
> >>set schema a123456789;
> --- SQL operation complete.
> >>create table mytable (c blob);
> --- SQL operation complete.
> >>insert into mytable values (stringtolob('my string'));
> --- 1 row(s) inserted.
> >>select * from mytable;
> C
>
> LOBH0200010432353574095397060919722103915125615358218212312231160552060024"TRAFODION"."A12345678
> --- 1 row(s) selected.
> *** glibc detected *** sqlci: corrupted double-linked list: 0x01d8eca0 ***
> === Backtrace: =
> /lib64/libc.so.6(+0x75e66)[0x7ff6252b7e66]
> /lib64/libc.so.6(+0x762ed)[0x7ff6252b82ed]
> /lib64/libc.so.6(+0x791c5)[0x7ff6252bb1c5]
> /lib64/libc.so.6(__libc_malloc+0x71)[0x7ff6252bc751]
> /usr/lib64/libstdc++.so.6(_Znwm+0x1d)[0x7ff625b2d0bd]
> /usr/lib64/libstdc++.so.6(_Znam+0x9)[0x7ff625b2d1d9]
> /home/trafodion/v1017ea/export/lib64/libsqlcilib.so(_ZN9InputStmt8readStmtEP8_IO_FILEi+0x24)[0x7ff6279112c4]
> /home/trafodion/v1017ea/export/lib64/libsqlcilib.so(_ZN4Obey7processEP8SqlciEnv+0x139)[0x7ff627912529]
> /home/trafodion/v1017ea/export/lib64/libsqlcilib.so(_ZN8SqlciEnv3runEPcS0_+0xb4)[0x7ff62791b084]
> sqlci[0x4019d2]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7ff625260d5d]
> sqlci[0x4014b9]
> === Memory map:
> 0040-00403000 r-xp fd:02 35783332 /home/trafodion/v1017ea/export/bin64/sqlci
> 00602000-00603000 rw-p 2000 fd:02 35783332 /home/trafodion/v1017ea/export/bin64/sqlci
> 00a4c000-03be7000 rw-p 00:00 0 [heap]
> 1000-1400 rw-s 00:04 6389765 /SYSV1000de2b (deleted)
> dae0-dcb0 rw-p 00:00 0
> dcb0-e000 rw-p 00:00 0
> e000-1 rw-p 00:00 0
> 7ff5f3b99000-7ff5f3b9a000 ---p 00:00 0
> 7ff5f3b9a000-7ff5f459a000 rw-p 00:00 0
> 7ff5f459a000-7ff5f459b000 ---p 00:00 0
> 7ff5f459b000-7ff5f4f9b000 rw-p 0
[jira] [Updated] (TRAFODION-1099) LP Bug: 1437384 - sqenvcom.sh. Our CLASSPATH is too big.
[ https://issues.apache.org/jira/browse/TRAFODION-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandhya Sundaresan updated TRAFODION-1099:

Priority: Major (was: Critical)

> LP Bug: 1437384 - sqenvcom.sh. Our CLASSPATH is too big.
> ---
>
> Key: TRAFODION-1099
> URL: https://issues.apache.org/jira/browse/TRAFODION-1099
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-general
> Reporter: Guy Groulx
> Assignee: Sandhya Sundaresan
> Fix For: 2.0-incubating
>
> sqenvcom.sh sets up CLASSPATH for Trafodion.
> With HDP 2.2, this CLASSPATH is huge. On one of our systems, echo $CLASSPATH | wc -c returns more than 13000 bytes.
> I believe Java/Linux truncates these variables when they are too big.
> Since going to HDP 2.2, we've been hit with "class not found" errors even though the jar is in CLASSPATH.
> http://stackoverflow.com/questions/1237093/using-wildcard-for-classpath explains that we can use wildcards in CLASSPATH to reduce it.
> Rules:
> Use * and not *.jar. Java assumes that * in the classpath stands for *.jar.
> When using export CLASSPATH, use quotes so that * is not expanded by the shell. E.g.:
> export CLASSPATH="/usr/hdp/current/hadoop-client/lib/*:${CLASSPATH}"
> We need to modify our sqenvcom.sh to use wildcards instead of listing individual jars.
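The wildcard rules quoted above can be illustrated with a small helper that collapses individual jar entries into one `dir/*` entry per directory. This is a sketch, not a proposed change to sqenvcom.sh: the function name `collapse_classpath` is hypothetical, and it assumes POSIX-style paths with `:` as the separator. Note that a wildcard entry picks up every jar in the directory, so the set and order of visible classes may differ from the explicit list.

```python
import posixpath


def collapse_classpath(classpath, sep=":"):
    """Replace individual .jar entries with one 'dir/*' entry per directory.

    Non-jar entries (for example configuration directories) are kept as-is,
    and duplicates are dropped while preserving first-seen order.
    """
    result, seen = [], set()
    for entry in classpath.split(sep):
        if not entry:
            continue
        if entry.endswith(".jar"):
            # A bare 'x.jar' has no directory part; treat it as './*'.
            entry = (posixpath.dirname(entry) or ".") + "/*"
        if entry not in seen:
            seen.add(entry)
            result.append(entry)
    return sep.join(result)
```

When the result is exported from a shell script, it still has to be quoted as described in the rules so that the shell does not expand the `*` itself.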
[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995948#comment-14995948 ]

Kevin Xu commented on TRAFODION-1578:

For multi-threaded DCS, the path is DCSMaster -> DCSServer -> jdbcT2. For SPJ, it's DCSMaster -> DCSServer -> UDR -> jdbcT2 (Java -> C++ -> Java; why not Java -> Java?).

It's not necessary to create a new JVM for each connection. With a single JVM, however, the JARs would have to be loaded dynamically, as UDR does today. CONTAINS SQL is trying to reuse the JVM, right? As I mentioned, there is a customized idle timeout (including never stopping) for standby.

Yes, users can modify JVM options directly, but there will be limits on customized options such as the maximum heap size of the JVM and the number of connections for one user. They are also allowed to start/restart it. I'd like to experiment with a new command, 'agent'.
[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995562#comment-14995562 ]

Hans Zeller commented on TRAFODION-1578:

Some comments on two points made in the description:

??3. DCSServer start a JVM for the user. User can modify JVM options, program properties and JAVA classpath. At the same time, a monitor class will be starting in the JVM which will register a node on Zookeeper for this JVM as well as metadata info (process id, server info and so on) and the node will be removed while JVM exiting. It allows customer to specify JVM idle time in case of some realtime scenario like Kafka consumer.??

As mentioned in the previous comment, this would work for trusted UDRs only. For isolated UDRs, we need a separate process. Also, I don't think we want the user to modify JVM options; that is the job of the DBA, and allowing it would be a security problem. Third, there is only one JVM per process, so the JVM of DCSServer would be used for the Trafodion engine as well as for trusted UDRs. The JVM gets started before a UDR gets invoked (e.g. for reading metadata), so by the time we invoke a UDR it's too late to change the JVM parameters anyway. Finally, we already have logic to reuse DCSServers. Do we really need additional logic for that type of reuse?

??4. Useful commands on Trafci: list all JVMs in user; kill one of them that no longer in use; Restart JVMs with latest JARs and so on.??

The DCS web GUI already lists all the DCSServers. Do we need another command in Trafci? Again, regular users should not be able to kill DCSServers; that's the job of a DBA.
[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995560#comment-14995560 ]

Hans Zeller commented on TRAFODION-1578:

To add a bit more info about the security issues with UDRs (User-defined Routines):

There are trusted and isolated UDRs. Trusted UDRs run in the Trafodion engine and can bypass any security rules like GRANT/REVOKE. Therefore, only trusted users can write such UDRs. Isolated UDRs run in a separate process under a different user id and should not be able to break security rules.

Right now, Trafodion has neither. UDRs run in a separate process, but that process runs under the Trafodion id. That will need to change, and we can also implement trusted UDRs.

Another issue is spoofing. We need to make sure that an attacker can't execute arbitrary code through the UDR mechanism. Our plan is to solve that by doing the following:

- Require that UDR libraries reside in a directory that is controlled by Trafodion, such that users other than the Trafodion id can't directly add or modify them.
- Require a special privilege to create trusted libraries and another (lesser) privilege to create isolated libraries. Creating a library involves taking user code in the form of a file in HDFS, a URL, or a local file on the server. Venkat is planning to add a GUI function to provide a client file as the code of a library.
- Have one or more special user ids that execute UDRs; don't run the MXUDR process under the Trafodion id as is done today.