[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management

2015-11-08 Thread Hans Zeller (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996155#comment-14996155
 ] 

Hans Zeller commented on TRAFODION-1578:


The JVMs used for the Trafodion server serve mainly the Trafodion engine and 
they need to execute multiple UDRs, since we don't know which UDRs will be 
executed when we start the master executor:

1. User connects. A JVM is reused or started for the Trafodion engine. We don't 
know yet which UDRs the user will execute.
2. User calls SPJ a. At this time the JVM is already running.
3. User calls TMUDF b. Now we have to reuse the same JVM, if we don't use an 
MXUDR server.

Are you saying that a user should specify at connect time how much heap size 
they want to have in the JVM used in their connection? We could probably do a 
limited set of options like that.

Often it's probably the creator of a UDR, not the user of it, who may want to 
specify heap size and other parameters of the JVM used to execute a UDR. End 
users may not want to have to worry about that. Having a separate JVM for the 
UDRs should help avoid conflicts between Trafodion and UDRs.

Thanks for your patience and your explanations. I hope I now understand your 
proposal to use sandboxing with a SecurityManager and to set JVM parameters a 
bit better, and it makes more sense to me. I still have questions, though 
(see the comment on the thread above) about the SecurityManager being too 
restrictive for many UDRs.

> Proposal for SPJ management
> ---
>
> Key: TRAFODION-1578
> URL: https://issues.apache.org/jira/browse/TRAFODION-1578
> Project: Apache Trafodion
> Issue Type: Improvement
> Components: connectivity-dcs
> Reporter: Kevin Xu
>
> JAR upload process:
> 1. Initialize JAR upload procedure by default
> 2. JAR upload by Trafci (add library LIB_NAME JAR_LOCAL_PATH). Upload and 
> create library will be done here. Alternatively, you can upload only the JARs 
> with the UPLOAD command on Trafci, which will not create a library.
>    Tips: Before putting the JAR into HDFS, check its MD5 first. If the file 
> already exists, only add a record to the metadata table, in case users upload 
> the same JAR many times on the platform.
> 3. On the server side, the JAR will be stored in HDFS. At the same time, the 
> JAR metadata (path in HDFS, MD5 of the file, and others) is stored in the 
> stored procedure metadata table.
> 4. create procedure is the same as now.
> JAR execution process:
> 1. Send a CALL by Trafci/JDBC/ODBC/ADO.NET.
> 2. DCSMaster assigns a DCSServer for the CALL.
> 3. DCSServer starts a JVM for the user. The user can modify JVM options, 
> program properties and the Java classpath. At the same time, a monitor class 
> will be started in the JVM, which will register a node on ZooKeeper for this 
> JVM along with metadata info (process id, server info and so on). The node 
> will be removed when the JVM exits. The customer can specify a JVM idle time 
> for real-time scenarios such as a Kafka consumer.
> 4. Useful commands on Trafci: list all JVMs of a user; kill one that is no 
> longer in use; restart JVMs with the latest JARs; and so on.
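
The MD5 check described in the tip under upload step 2 could look roughly like the following. This is a minimal sketch, not Trafodion code: the class JarRegistry, its in-memory maps (standing in for the metadata table) and the /hypothetical/jars/ path are all assumptions made for illustration. Only java.security.MessageDigest is a real API here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the JAR metadata table. */
final class JarRegistry {
    // MD5 hex digest -> storage path of the single stored copy
    private final Map<String, String> storedByMd5 = new HashMap<>();
    // One metadata record per upload request: "libraryName -> path"
    private final List<String> records = new ArrayList<>();

    /** Returns the lowercase hex MD5 digest of the given bytes. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /**
     * Registers an upload. Returns true if the bytes had to be stored,
     * false if an identical JAR already existed and only a record was added.
     */
    boolean upload(String libraryName, byte[] jarBytes) {
        String md5 = md5Hex(jarBytes);
        boolean isNew = !storedByMd5.containsKey(md5);
        if (isNew) {
            // A real implementation would write the file to HDFS at this point.
            storedByMd5.put(md5, "/hypothetical/jars/" + md5 + ".jar");
        }
        records.add(libraryName + " -> " + storedByMd5.get(md5));
        return isNew;
    }

    int storedCount() { return storedByMd5.size(); }

    int recordCount() { return records.size(); }
}
```

Uploading the same bytes under two library names stores one copy and adds two records, which is the behavior the tip asks for.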



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management

2015-11-08 Thread Hans Zeller (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996145#comment-14996145
 ] 

Hans Zeller commented on TRAFODION-1578:


Thanks for the explanation and the link. Yesterday I didn't realize that the 
SecurityManager can also be used to sandbox a piece of Java code. I could 
imagine that this could work for simple functions, especially for scalar UDFs 
that usually do simple, localized operations. I don't think it would work for 
TMUDFs. It might work for SPJs that do only Trafodion SQL and other simple 
operations.

What about Java UDRs that want to access HDFS or HBase or other JDBC drivers? 
What about UDRs that call other code like Spark, Kafka or MongoDB?



[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management

2015-11-08 Thread Kevin Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996143#comment-14996143
 ] 

Kevin Xu commented on TRAFODION-1578:
-

One JVM (process) per Trafodion user by default. Users know better than we do 
how much heap they need. That said, limits can be set on each JVM so that only 
certain authorized options can be modified. I'm not saying that it will share 
a JVM with DCSServer. DCSServer will create a JVM with a customized idle 
timeout (the user may decide how long it stays alive, possibly forever). 
As I said, the new path will be DCSMaster->DCSServer->create new JVM for the 
current user if one does not exist. This does not involve UDR.

Yes, regular users don't have permission to kill DCSServers. They have 
permission to start/stop the SPJ JVM (one for each user by default).
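
To make the idea of "only certain authorized options" concrete, here is a minimal sketch of how a launcher might validate user-supplied JVM options before starting a per-user JVM. Everything in it is an assumption for illustration: the class SpjJvmLauncher, the 1024 MB cap and the rule that only -Xmx/-Xms are accepted are not part of DCS.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; the option policy below is an assumption, not DCS code. */
final class SpjJvmLauncher {
    /** Assumed administrator-imposed cap on heap options, in megabytes. */
    static final int MAX_HEAP_MB = 1024;

    /** Accepts only -Xmx<N>m and -Xms<N>m with N up to the cap; rejects everything else. */
    static boolean isAllowed(String option) {
        if (!option.matches("-Xm[xs]\\d{1,6}m")) {
            return false;
        }
        int megabytes = Integer.parseInt(option.substring(4, option.length() - 1));
        return megabytes <= MAX_HEAP_MB;
    }

    /** Builds the command line for a new JVM, failing on any option that is not allowed. */
    static List<String> buildCommand(String javaBin, List<String> requested,
                                     String classpath, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        for (String opt : requested) {
            if (!isAllowed(opt)) {
                throw new IllegalArgumentException("JVM option not permitted: " + opt);
            }
            cmd.add(opt);
        }
        cmd.add("-cp");
        cmd.add(classpath);
        cmd.add(mainClass);
        return cmd;
    }
}
```

The resulting list can be handed to `new ProcessBuilder(cmd)`. Because the options are fixed when the process starts, any limit has to be enforced at this point rather than after the JVM is running.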



[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management

2015-11-08 Thread Kevin Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996115#comment-14996115
 ] 

Kevin Xu commented on TRAFODION-1578:
-

For authorization, the check should be done in DCSServer (check whether the 
current user has the privilege to call) by querying metadata tables. One JVM 
(process) per user by default. 
For arbitrary code, the JVM SecurityManager should be helpful. 
  1. FilePermission in SecurityManager should be helpful.
  2. It should be solved if the started JVM only has privileges on specified 
folders. Eventually, the JARs should exist on the running node, and it's fine 
to copy them from HDFS the first time. It's very good to hear that Venkat will 
add a GUI function.
  3. If the JVM's behavior is under control, it's not necessary to run a 
process under special user ids (one user id per connection? If not, 
connections will impact each other. Otherwise, many user ids will have to be 
created.)
  
Refs: https://en.wikipedia.org/wiki/Java_security#Security_manager
  
http://docs.oracle.com/javase/7/docs/technotes/guides/security/overview/jsoverview.html
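
As a rough illustration of the FilePermission idea, the sketch below models a permission set that grants read access under a single directory and checks requests against it. It deliberately does not install a SecurityManager (which was later deprecated for removal in Java 17 by JEP 411). It only shows how the permission model decides. The class UdrSandboxPolicy and the /hypothetical/... paths are assumptions. FilePermission, SocketPermission and Permissions are standard JDK classes.

```java
import java.io.FilePermission;
import java.security.Permission;
import java.security.Permissions;

/** Illustrative only: models a sandbox policy without installing a SecurityManager. */
final class UdrSandboxPolicy {
    /** Builds a permission set that allows reads below one directory tree and nothing else. */
    static Permissions readOnlyUnder(String dir) {
        Permissions perms = new Permissions();
        // A trailing "/-" means all files under dir, recursively.
        perms.add(new FilePermission(dir + "/-", "read"));
        return perms;
    }

    /** True if the permission set covers the requested permission. */
    static boolean allows(Permissions perms, Permission requested) {
        return perms.implies(requested);
    }
}
```

With `readOnlyUnder("/hypothetical/udr/libs")`, a read of a JAR below that directory is allowed, while a write to the same file, a read of a file elsewhere, or a `java.net.SocketPermission` for an outbound connection is refused. The last case is why UDRs that need HDFS, HBase or JDBC access would require additional grants.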
 



[jira] [Comment Edited] (TRAFODION-1578) Proposal for SPJ management

2015-11-08 Thread Kevin Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995948#comment-14995948
 ] 

Kevin Xu edited comment on TRAFODION-1578 at 11/9/15 5:04 AM:
--

For multi-threaded DCS, it's DCSMaster->DCSServer->jdbcT2. For SPJ, it's 
DCSMaster->DCSServer->SQ Engine->UDR->jdbcT2 (JAVA->C++->JAVA, why not 
JAVA->JAVA?). It's not necessary to create a new JVM for each connection. But 
a single JVM would have to load the JARs dynamically, as UDR does, and that 
has security issues. So the proposal is to start a new JVM and manage its 
security with the Java SecurityManager. Contains SQL is trying to reuse the 
JVM, right? As I mentioned, there is a customizable idle timeout (including 
never stopping) for standby. 
Yes, users can modify JVM options directly, but there will be limits on 
customized options such as the max heap size of the JVM and the number of 
connections for one user. Users are also allowed to start/restart. I'd like to 
experiment with a new command, 'agent'.




[jira] [Commented] (TRAFODION-1591) Need a better way to pick ports for install_local_hadoop

2015-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996035#comment-14996035
 ] 

ASF GitHub Bot commented on TRAFODION-1591:
---

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-trafodion/pull/162


> Need a better way to pick ports for install_local_hadoop
> 
>
> Key: TRAFODION-1591
> URL: https://issues.apache.org/jira/browse/TRAFODION-1591
> Project: Apache Trafodion
> Issue Type: Improvement
> Components: sql-general
> Affects Versions: 2.0-incubating
> Reporter: David Wayne Birdsall
> Assignee: David Wayne Birdsall
> Priority: Minor
>






[jira] [Commented] (TRAFODION-1604) LOB: Lob handle with a schema name longer than 8 characters crashes sqlci at malloc()

2015-11-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996016#comment-14996016
 ] 

ASF GitHub Bot commented on TRAFODION-1604:
---

GitHub user sandhyasun opened a pull request:

https://github.com/apache/incubator-trafodion/pull/165

Resolving the following JIRAs related to the LOB feature:

TRAFODION-1604 - Fixed one place in CharType.h where 100 was the max limit 
for lob handle length.
TRAFODION-1596 - Added checks in alter code to prevent ALTER from adding LOB 
columns.
TRAFODION-1598 - Added several syntax fixes in parser and ExpLOBaccess.cpp 
to address these problems.
TRAFODION-1599 - Added checks in binder to prevent sample columns from 
being LOB columns.
TRAFODION-1602 - Added checks in DDL layer to prevent LOB columns from being 
used in unique constraints or store by.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sandhyasun/incubator-trafodion lob_work_files2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/165.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #165


commit 825216d34698fd30b209e19ae0520ebd6f765fc0
Author: Sandhya Sundaresan 
Date:   2015-11-09T04:36:20Z

Resolving the following JIRAs:
TRAFODION-1604 - Fixed one place in CharType.h where 100 was the max limit 
for lob handle length.
TRAFODION-1596 - Added checks in alter code to prevent ALTER from adding LOB 
columns.
TRAFODION-1598 - Added several syntax fixes in parser and ExpLOBaccess.cpp 
to address these problems.
TRAFODION-1599 - Added checks in binder to prevent sample columns from 
being LOB columns.
TRAFODION-1602 - Added checks in DDL layer to prevent LOB columns from being 
used in unique constraints or store by.




> LOB: Lob handle with a schema name longer than 8 characters crashes sqlci at 
> malloc()
> -
>
> Key: TRAFODION-1604
> URL: https://issues.apache.org/jira/browse/TRAFODION-1604
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-exe
> Affects Versions: 1.2-incubating
> Reporter: Sandhya Sundaresan
> Assignee: Sandhya Sundaresan
>
> As shown in the following example, a schema name with 8 characters, 
> ‘a1234567’, generates a proper lob handle. But once the schema name is longer 
> than 8 characters, such as ‘a123456789’, the schema name in the lob handle is 
> truncated without the closing double quote, and selecting the lob handle 
> crashes sqlci at malloc():
> LOBH0200010432353574095397060919722103915125615358218212312231160552060024"TRAFODION"."A12345678
> >>create schema a123456789;
> --- SQL operation complete.
> >>set schema a123456789;
> --- SQL operation complete.
> >>create table mytable (c blob);
> --- SQL operation complete.
> >>insert into mytable values (stringtolob('my string'));
> --- 1 row(s) inserted.
> >>select * from mytable;
> C
> 
> LOBH0200010432353574095397060919722103915125615358218212312231160552060024"TRAFODION"."A12345678
> --- 1 row(s) selected.
> *** glibc detected *** sqlci: corrupted double-linked list: 
> 0x01d8eca0 ***
> === Backtrace: =
> /lib64/libc.so.6(+0x75e66)[0x7ff6252b7e66]
> /lib64/libc.so.6(+0x762ed)[0x7ff6252b82ed]
> /lib64/libc.so.6(+0x791c5)[0x7ff6252bb1c5]
> /lib64/libc.so.6(__libc_malloc+0x71)[0x7ff6252bc751]
> /usr/lib64/libstdc++.so.6(_Znwm+0x1d)[0x7ff625b2d0bd]
> /usr/lib64/libstdc++.so.6(_Znam+0x9)[0x7ff625b2d1d9]
> /home/trafodion/v1017ea/export/lib64/libsqlcilib.so(_ZN9InputStmt8readStmtEP8_IO_FILEi+0x24)[0x7ff6279112c4]
> /home/trafodion/v1017ea/export/lib64/libsqlcilib.so(_ZN4Obey7processEP8SqlciEnv+0x139)[0x7ff627912529]
> /home/trafodion/v1017ea/export/lib64/libsqlcilib.so(_ZN8SqlciEnv3runEPcS0_+0xb4)[0x7ff62791b084]
> sqlci[0x4019d2]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7ff625260d5d]
> sqlci[0x4014b9]
> === Memory map: 
> 0040-00403000 r-xp  fd:02 35783332 
> /home/trafodion/v1017ea/export/bin64/sqlci
> 00602000-00603000 rw-p 2000 fd:02 35783332 
> /home/trafodion/v1017ea/export/bin64/sqlci
> 00a4c000-03be7000 rw-p  00:00 0 [heap]
> 1000-1400 rw-s  00:04 6389765 /SYSV1000de2b (deleted)
> dae0-dcb0 rw-p  00:00 0
> dcb0-e000 rw-p  00:00 0
> e000-1 rw-p  00:00 0
> 7ff5f3b99000-7ff5f3b9a000 ---p  00:00 0
> 7ff5f3b9a000-7ff5f459a000 rw-p  00:00 0
> 7ff5f459a000-7ff5f459b000 ---p  00:00 0
> 7ff5f459b000-7ff5f4f9b000 rw-p 0

[jira] [Updated] (TRAFODION-1099) LP Bug: 1437384 - sqenvcom.sh. Our CLASSPATH is too big.

2015-11-08 Thread Sandhya Sundaresan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TRAFODION-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandhya Sundaresan updated TRAFODION-1099:
--
Priority: Major  (was: Critical)

> LP Bug: 1437384 - sqenvcom.sh. Our CLASSPATH is too big.
> ---
>
> Key: TRAFODION-1099
> URL: https://issues.apache.org/jira/browse/TRAFODION-1099
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-general
> Reporter: Guy Groulx
> Assignee: Sandhya Sundaresan
> Fix For: 2.0-incubating
>
>
> sqenvcom.sh sets up CLASSPATH for Trafodion.
> With HDP 2.2, this CLASSPATH is huge. On one of our systems, echo $CLASSPATH 
> | wc -c returns > 13000 bytes.
> I believe Java/Linux truncates these variables when they are too big.
> Since going to HDP 2.2, we've been hit with "class not found" errors 
> even though the jar is in CLASSPATH.
> http://stackoverflow.com/questions/1237093/using-wildcard-for-classpath 
> explains that we can use wildcards in CLASSPATH to reduce it.
> Rules:
> Use * and not *.jar. Java assumes that * in the classpath stands for *.jar.
> When using export CLASSPATH, use quotes so that * is not expanded. E.g.:
> export CLASSPATH="/usr/hdp/current/hadoop-client/lib/*:${CLASSPATH}"
> We need to modify our sqenvcom.sh to use wildcards instead of listing 
> individual jars.
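
The wildcard rule can be illustrated with a small sketch that collapses individual JAR entries into one `dir/*` entry per directory. The class ClasspathShortener is hypothetical and not part of sqenvcom.sh, and it assumes a Unix-style `:` separator. Two caveats apply: a wildcard picks up every JAR in the directory, not just the ones originally listed, and the order in which the JVM enumerates JARs within a wildcard entry is unspecified.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper, not part of sqenvcom.sh. */
final class ClasspathShortener {
    /**
     * Replaces each "dir/name.jar" entry with "dir/*", keeps non-JAR entries
     * as they are, preserves first-seen order, and drops duplicates.
     */
    static String collapse(String classpath) {
        Set<String> out = new LinkedHashSet<>();
        for (String entry : classpath.split(":")) {
            if (entry.isEmpty()) {
                continue; // skip empty entries produced by "::" or a leading ":"
            }
            int slash = entry.lastIndexOf('/');
            if (entry.endsWith(".jar") && slash >= 0) {
                out.add(entry.substring(0, slash) + "/*");
            } else {
                out.add(entry);
            }
        }
        return String.join(":", out);
    }
}
```

For example, `collapse("/a/lib/x.jar:/a/lib/y.jar:/a/conf:/b/z.jar")` yields `/a/lib/*:/a/conf:/b/*`.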





[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management

2015-11-08 Thread Kevin Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995948#comment-14995948
 ] 

Kevin Xu commented on TRAFODION-1578:
-

For multi-threaded DCS, it's DCSMaster->DCSServer->jdbcT2. For SPJ, it's 
DCSMaster->DCSServer->UDR->jdbcT2 (JAVA->C++->JAVA, why not JAVA->JAVA?). 
It's not necessary to create a new JVM for each connection. But a single JVM 
would have to load the JARs dynamically, as UDR does. Contains SQL is trying 
to reuse the JVM, right? As I mentioned, there is a customizable idle timeout 
(including never stopping) for standby. 
Yes, users can modify JVM options directly, but there will be limits on 
customized options such as the max heap size of the JVM and the number of 
connections for one user. Users are also allowed to start/restart. I'd like to 
experiment with a new command, 'agent'.



[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management

2015-11-08 Thread Hans Zeller (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995562#comment-14995562
 ] 

Hans Zeller commented on TRAFODION-1578:


Some comments on two points made in the description:

??3. DCSServer starts a JVM for the user. The user can modify JVM options, 
program properties and the Java classpath. At the same time, a monitor class 
will be started in the JVM, which will register a node on ZooKeeper for this 
JVM along with metadata info (process id, server info and so on). The node 
will be removed when the JVM exits. The customer can specify a JVM idle time 
for real-time scenarios such as a Kafka consumer.??

As mentioned in the previous comment, this would work for trusted UDRs only. 
For isolated UDRs, we need a separate process. Also, I don't think we want the 
user to modify JVM options, that is the job of the DBA and would be a security 
problem. Third, there is only one JVM for a process, so the JVM of DCSServer 
would be used for the Trafodion engine as well as for trusted UDRs. The JVM 
gets started before a UDR gets invoked (e.g. for reading metadata), so when we 
invoke a UDR it's too late anyway to change the JVM parameters. Finally, we 
already have logic to reuse DCSServers, do we really need additional logic for 
that type of reuse?

??4. Useful commands on Trafci: list all JVMs of a user; kill one that is no 
longer in use; restart JVMs with the latest JARs; and so on.??

The DCS web GUI already lists all the DCSServers. Do we need another command in 
Trafci? Again, regular users should not be able to kill DCSServers, that's the 
job of a DBA.



[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management

2015-11-08 Thread Hans Zeller (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995560#comment-14995560
 ] 

Hans Zeller commented on TRAFODION-1578:


To add a bit more info about the security issues with UDRs (User-defined 
Routines):

There are trusted and isolated UDRs: Trusted UDRs run in the Trafodion engine 
and can bypass any security rules like GRANT/REVOKE. Therefore, only trusted 
users can write such UDRs. Isolated UDRs run in a separate process under a 
different user id and should not be able to break security rules.

Right now, Trafodion has neither. UDRs run in a separate process, but that 
process runs under the Trafodion id. That will need to change and we can also 
implement trusted UDRs.

Another issue is spoofing. We need to make sure that an attacker can't execute 
arbitrary code through the UDR mechanism. Our plan to solve that is by doing 
the following:

- Require that UDR libraries reside in a directory that is controlled by 
Trafodion, such that users other than the Trafodion id can't directly add or 
modify them.
- Require a special privilege to create trusted and another (lesser) privilege 
to create isolated libraries. Creating a library involves taking user code in 
the form of a file in HDFS, a URL, or a local file on the server. Venkat is 
planning to add a GUI function to provide a client file as the code of a 
library.
- Have one or more special user ids that execute UDRs, instead of running the 
MXUDR process under the Trafodion id as is done today.
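
The first requirement in the list above, that UDR libraries reside in a directory controlled by Trafodion, implies a check along these lines. This is a minimal sketch under stated assumptions: LibraryLocationCheck and the paths used with it are hypothetical, and the check is purely lexical. A real implementation would also need to resolve symbolic links (for example with `Path.toRealPath()`) and verify file ownership and permissions.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical check; lexical only, does not touch the file system. */
final class LibraryLocationCheck {
    /** True if candidate, after normalization, lies strictly inside trustedDir. */
    static boolean isInside(String trustedDir, String candidate) {
        Path root = Paths.get(trustedDir).toAbsolutePath().normalize();
        Path file = Paths.get(candidate).toAbsolutePath().normalize();
        // Path.startsWith compares whole name elements, so "/x/udr2" is not inside "/x/udr".
        return file.startsWith(root) && !file.equals(root);
    }
}
```

Normalizing before comparing is what rejects a path such as `/hypothetical/udr/../etc/passwd`, which would otherwise appear to start with the trusted directory.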
