[jira] [Created] (TRAFODION-1578) Proposal for SPJ management

2015-11-03 Thread Kevin Xu (JIRA)
Kevin Xu created TRAFODION-1578:
---

 Summary: Proposal for SPJ management
 Key: TRAFODION-1578
 URL: https://issues.apache.org/jira/browse/TRAFODION-1578
 Project: Apache Trafodion
  Issue Type: Improvement
  Components: connectivity-dcs
Reporter: Kevin Xu


JAR upload process:
1. Initialize the JAR upload procedure by default.
2. Upload the JAR via TrafCI (add library LIB_NAME JAR_LOCAL_PATH). Both the 
upload and the library creation are done in this step. Alternatively, the 
UPLOAD command in TrafCI uploads the JAR only, without creating a library.
   Tip: Before putting the JAR into HDFS, check its MD5 first; if the file 
already exists, only add a record to the metadata table, in case users upload 
the same JAR many times to the platform.
3. On the server side, the JAR is stored in HDFS. At the same time, the JAR 
metadata (path in HDFS, MD5 of the file, and others) is stored in the stored 
procedure metadata table.
4. create procedure works the same as it does now.
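The MD5-based dedup described in step 2's tip could be sketched roughly as follows. This is a minimal Python sketch, not Trafodion code: a local directory stands in for the HDFS target, a dict stands in for the stored procedure metadata table, and all names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path

def md5_of(path: str) -> str:
    """Compute the MD5 of a file in chunks so large JARs fit in memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def register_jar(local_jar: str, store_dir: str, metadata: dict) -> str:
    """Upload the JAR only if its MD5 is new; always record a metadata entry.

    `metadata` stands in for the stored procedure metadata table and maps
    md5 -> stored path. `store_dir` stands in for the HDFS target directory.
    """
    digest = md5_of(local_jar)
    if digest in metadata:                  # same bytes already uploaded:
        return metadata[digest]             # reuse the existing copy
    dest = Path(store_dir) / f"{digest}-{Path(local_jar).name}"
    Path(store_dir).mkdir(parents=True, exist_ok=True)
    shutil.copy(local_jar, dest)            # real code would do an HDFS put
    metadata[digest] = str(dest)
    return str(dest)
```

Keying the stored path by digest means repeated uploads of identical bytes resolve to one physical copy, with only a metadata record added per upload.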

JAR execution process:
1. Send a CALL via TrafCI/JDBC/ODBC/ADO.NET.
2. DCSMaster assigns a DCSServer for the CALL.
3. DCSServer starts a JVM for the user. The user can modify the JVM options, 
program properties, and Java classpath. At the same time, a monitor class 
starts in the JVM; it registers a node on ZooKeeper for this JVM, along with 
metadata (process ID, server info, and so on), and the node is removed when 
the JVM exits. The user can also specify a JVM idle time, to support real-time 
scenarios such as a Kafka consumer.
4. Useful commands in TrafCI: list all of a user's JVMs; kill one that is no 
longer in use; restart JVMs with the latest JARs; and so on.
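The idle-time handling in step 3 could look like the following sketch. This is hypothetical Python (the real monitor would presumably be a Java class inside the per-user JVM), and the ZooKeeper registration is elided; only the idle-timeout decision is shown. An injectable clock keeps the logic testable.

```python
import time

class JvmIdleMonitor:
    """Track the last CALL time and decide when an idle JVM may exit.

    Sketch of the monitor class from step 3; the real class would also
    register and remove an ephemeral ZooKeeper node for the JVM.
    `idle_limit_secs` is the user-specified idle time; None means never
    exit (e.g. a long-lived Kafka consumer).
    """
    def __init__(self, idle_limit_secs, clock=time.monotonic):
        self.idle_limit = idle_limit_secs
        self.clock = clock
        self.last_call = clock()

    def on_call(self):
        """Invoked whenever the JVM services a CALL statement."""
        self.last_call = self.clock()

    def should_exit(self):
        """True once the JVM has been idle longer than the limit."""
        if self.idle_limit is None:
            return False
        return self.clock() - self.last_call >= self.idle_limit
```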



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TRAFODION-1579) Clean up test scripts

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987785#comment-14987785
 ] 

ASF GitHub Bot commented on TRAFODION-1579:
---

GitHub user hegdean opened a pull request:

https://github.com/apache/incubator-trafodion/pull/159

[TRAFODION-1579] Clean up test scripts



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hegdean/incubator-trafodion wrk-brnch

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/159.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #159


commit 0ce8b15c094a1e43e13eec67e6d4f35c898533f8
Author: Anuradha Hegde 
Date:   2015-11-03T17:53:50Z

Cleanup the scripts to remove HP names

commit 3bc1a7f1c43096c27c0317a26847ecc6b6be8bbe
Author: Anuradha Hegde 
Date:   2015-11-03T17:58:22Z

Merge branch 'master' of github.com:apache/incubator-trafodion into 
wrk-brnch

Conflicts:
core/sqf/sql/scripts/install_traf_components




> Clean up test scripts
> -
>
> Key: TRAFODION-1579
> URL: https://issues.apache.org/jira/browse/TRAFODION-1579
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: connectivity-mxosrvr
>Reporter: Anuradha Hegde
>
> The ODBC test templates have HP names for datasource and default service. 





[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management

2015-11-03 Thread Venkat Muthuswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987928#comment-14987928
 ] 

Venkat Muthuswamy commented on TRAFODION-1578:
--

1. I suggest that we keep the interface generic, not just for JARs. It should 
allow deploying UDF files too.

2. Storing SPJs in user-specific directories can be a problem: you cannot 
share them with other users. Oftentimes an SPJ developer or admin will deploy 
the SPJs and other users/applications will call them, so limiting access to 
the user who deployed an SPJ will not work. The security concerns can be 
addressed using the SQL privileges on the Library object, which encapsulates 
the jar or UDF file. We had a similar design in the pre-Apache product and had 
to revert to using a single directory for storage, because managing per-user 
directories across roles, users, and public became too cumbersome.

3. We should keep in mind the ODBC buffer limits and the size of the jar. The 
client (TrafCI) will have to send the jar contents in chunks, and the SPJ has 
to assemble them into a single LOB or HDFS file.

4. I have a security concern about the additional commands being proposed: 
"list all JVMs in user; kill one of them that no long in use; Restart JVMs 
with latest JARs and so on". How will you enforce security here? Consider 
leveraging and extending the Trafodion REST server for these kinds of 
manageability functions; they do not seem to belong in TrafCI.

5. Make sure the proposed interfaces prevent denial-of-service attacks and add 
the necessary checks to limit the size of the file, etc.
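The chunked transfer from point 3, combined with the size cap from point 5, might look roughly like this Python sketch. The chunk size, the cap, and all names here are assumptions for illustration, not actual Trafodion interfaces.

```python
import hashlib

CHUNK_SIZE = 32 * 1024             # stay under the ODBC buffer limit (assumed)
MAX_JAR_BYTES = 100 * 1024 * 1024  # server-side size cap (assumed policy)

def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Client side: yield the JAR contents as buffer-sized chunks."""
    for off in range(0, len(data), chunk_size):
        yield data[off:off + chunk_size]

def assemble_chunks(chunks, expected_md5: str, max_bytes: int = MAX_JAR_BYTES) -> bytes:
    """Server side: reassemble chunks, enforcing the size cap and checksum."""
    h = hashlib.md5()
    parts, total = [], 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_bytes:
            raise ValueError("JAR exceeds server size limit")
        h.update(chunk)
        parts.append(chunk)
    if h.hexdigest() != expected_md5:
        raise ValueError("MD5 mismatch: upload corrupted or truncated")
    return b"".join(parts)   # real code would write to a LOB or HDFS file
```

Checking the running total before buffering each chunk means an oversized upload is rejected early rather than after the whole file has been received.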

> Proposal for SPJ management
> ---
>
> Key: TRAFODION-1578
> URL: https://issues.apache.org/jira/browse/TRAFODION-1578
> Project: Apache Trafodion
>  Issue Type: Improvement
>  Components: connectivity-dcs
>Reporter: Kevin Xu
>





[jira] [Created] (TRAFODION-1579) Clean up test scripts

2015-11-03 Thread Anuradha Hegde (JIRA)
Anuradha Hegde created TRAFODION-1579:
-

 Summary: Clean up test scripts
 Key: TRAFODION-1579
 URL: https://issues.apache.org/jira/browse/TRAFODION-1579
 Project: Apache Trafodion
  Issue Type: Bug
  Components: connectivity-mxosrvr
Reporter: Anuradha Hegde


The ODBC test templates have HP names for datasource and default service. 





[jira] [Commented] (TRAFODION-1579) Clean up test scripts

2015-11-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988391#comment-14988391
 ] 

ASF GitHub Bot commented on TRAFODION-1579:
---

Github user hegdean closed the pull request at:

https://github.com/apache/incubator-trafodion/pull/159


> Clean up test scripts
> -
>
> Key: TRAFODION-1579
> URL: https://issues.apache.org/jira/browse/TRAFODION-1579
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: connectivity-mxosrvr
>Reporter: Anuradha Hegde
>
> The ODBC test templates have HP names for datasource and default service. 





[jira] [Created] (TRAFODION-1580) Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs

2015-11-03 Thread Hans Zeller (JIRA)
Hans Zeller created TRAFODION-1580:
--

 Summary: Integration between Trafodion and Drill, Spark, Kafka, 
etc., using TMUDFs
 Key: TRAFODION-1580
 URL: https://issues.apache.org/jira/browse/TRAFODION-1580
 Project: Apache Trafodion
  Issue Type: New Feature
  Components: sql-general
Affects Versions: 1.3-incubating
Reporter: Hans Zeller


This is a JIRA for multiple subtasks related to data integration with other 
Apache projects that can function as data sources or data sinks for Trafodion. 
This JIRA is specific to work that utilizes TMUDFs (Table-Mapping UDFs).





[jira] [Work started] (TRAFODION-1582) Install Apache drill as an optional add-on in install_local_hadoop

2015-11-03 Thread Hans Zeller (JIRA)

 [ 
https://issues.apache.org/jira/browse/TRAFODION-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on TRAFODION-1582 started by Hans Zeller.
--
> Install Apache drill as an optional add-on in install_local_hadoop
> --
>
> Key: TRAFODION-1582
> URL: https://issues.apache.org/jira/browse/TRAFODION-1582
> Project: Apache Trafodion
>  Issue Type: Sub-task
>  Components: sql-general
>Affects Versions: 1.3-incubating
>Reporter: Hans Zeller
>Assignee: Hans Zeller
> Fix For: 2.0-incubating
>
>
> Add a new script, install_local_drill, to download and install Apache Drill, 
> similar to how we install Hadoop and HBase.
> For Drill this might be relatively easy: just download the tar file and 
> extract it.





[jira] [Work started] (TRAFODION-1581) Add a TMUDF that can return a JDBC result set as table-valued output

2015-11-03 Thread Hans Zeller (JIRA)

 [ 
https://issues.apache.org/jira/browse/TRAFODION-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on TRAFODION-1581 started by Hans Zeller.
--
> Add a TMUDF that can return a JDBC result set as table-valued output
> 
>
> Key: TRAFODION-1581
> URL: https://issues.apache.org/jira/browse/TRAFODION-1581
> Project: Apache Trafodion
>  Issue Type: Sub-task
>  Components: sql-general
>Affects Versions: 1.3-incubating
>Reporter: Hans Zeller
>Assignee: Hans Zeller
> Fix For: 2.0-incubating
>
>
> One way to read data from other data sources would be a Trafodion TMUDF that 
> takes a connection string, an SQL statement and other necessary info as an 
> input, connects to a JDBC data source, prepares the statement, and returns 
> the result set as a table-valued output. This would enable a basic connector 
> for many data sources, including Spark, Drill and Kafka.
> Specifically, I would like to add a "predefined" TMUDF to Trafodion that 
> takes the following parameters:
> 1. The name of a jar with a JDBC driver.
> 2. A connection string to use
> 3. The class name of the driver
> 4. A user id
> 5. A password
> 6. The type of processing to do (right now only one type is supported)
> 7. Info depending on the type.
> The first type of processing I would like to add is "source", and it does the 
> following: It accepts a list of SQL statements to execute. Only one of these 
> statements can return a result set. The data in the result set will be 
> returned as table-valued output.
> Future processing types could do a parallel select like ODB does or they 
> could insert into a table on the system identified by the JDBC driver info.
> All parameters need to be compile-time constants, so that the UDF can connect 
> to the data source at compile time and prepare the statement. Based on the 
> prepared statement, it will determine number, names and SQL types of the 
> column(s) of the table-valued result.





[jira] [Work started] (TRAFODION-1580) Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs

2015-11-03 Thread Hans Zeller (JIRA)

 [ 
https://issues.apache.org/jira/browse/TRAFODION-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on TRAFODION-1580 started by Hans Zeller.
--
> Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs
> -
>
> Key: TRAFODION-1580
> URL: https://issues.apache.org/jira/browse/TRAFODION-1580
> Project: Apache Trafodion
>  Issue Type: New Feature
>  Components: sql-general
>Affects Versions: 1.3-incubating
>Reporter: Hans Zeller
>Assignee: Hans Zeller
>
> This is a JIRA for multiple subtasks related to data integration with other 
> Apache projects that can function as data sources or data sinks for 
> Trafodion. This JIRA is specific to work that utilizes TMUDFs (Table-Mapping 
> UDFs).





[jira] [Created] (TRAFODION-1583) Install Apache Spark as an optional add-on in install_local_hadoop

2015-11-03 Thread Hans Zeller (JIRA)
Hans Zeller created TRAFODION-1583:
--

 Summary: Install Apache Spark as an optional add-on in 
install_local_hadoop
 Key: TRAFODION-1583
 URL: https://issues.apache.org/jira/browse/TRAFODION-1583
 Project: Apache Trafodion
  Issue Type: Sub-task
Affects Versions: 1.3-incubating
Reporter: Hans Zeller
 Fix For: 2.0-incubating


Optionally install a local instance of Spark, so that we can use it to test 
integration between Trafodion and Spark.





[jira] [Created] (TRAFODION-1581) Add a TMUDF that can return a JDBC result set as table-valued output

2015-11-03 Thread Hans Zeller (JIRA)
Hans Zeller created TRAFODION-1581:
--

 Summary: Add a TMUDF that can return a JDBC result set as 
table-valued output
 Key: TRAFODION-1581
 URL: https://issues.apache.org/jira/browse/TRAFODION-1581
 Project: Apache Trafodion
  Issue Type: Sub-task
Affects Versions: 1.3-incubating
Reporter: Hans Zeller
Assignee: Hans Zeller
 Fix For: 2.0-incubating


One way to read data from other data sources would be a Trafodion TMUDF that 
takes a connection string, an SQL statement and other necessary info as an 
input, connects to a JDBC data source, prepares the statement, and returns the 
result set as a table-valued output. This would enable a basic connector for 
many data sources, including Spark, Drill and Kafka.

Specifically, I would like to add a "predefined" TMUDF to Trafodion that takes 
the following parameters:

1. The name of a jar with a JDBC driver.
2. A connection string to use
3. The class name of the driver
4. A user id
5. A password
6. The type of processing to do (right now only one type is supported)
7. Info depending on the type.

The first type of processing I would like to add is "source", and it does the 
following: It accepts a list of SQL statements to execute. Only one of these 
statements can return a result set. The data in the result set will be returned 
as table-valued output.

Future processing types could do a parallel select like ODB does or they could 
insert into a table on the system identified by the JDBC driver info.

All parameters need to be compile-time constants, so that the UDF can connect 
to the data source at compile time and prepare the statement. Based on the 
prepared statement, it will determine number, names and SQL types of the 
column(s) of the table-valued result.
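A rough Python model of the proposed "source" processing type, using sqlite3 in place of a JDBC source. The actual TMUDF would run inside Trafodion and use the JDBC driver named in its parameters; this sketch only illustrates the control flow, with all names hypothetical.

```python
import sqlite3

def source_udf(connect_str, statements):
    """Execute a list of SQL statements against a data source.

    Exactly one statement may return a result set; its rows become the
    table-valued output. sqlite3 stands in for an arbitrary JDBC
    connection. Column names and count come from the statement's
    metadata, mirroring how the TMUDF would inspect the prepared
    statement at compile time.
    """
    conn = sqlite3.connect(connect_str)
    try:
        columns, rows = None, None
        for stmt in statements:
            cur = conn.execute(stmt)
            if cur.description is not None:   # this statement returns rows
                if columns is not None:
                    raise ValueError("only one statement may return a result set")
                columns = [d[0] for d in cur.description]
                rows = cur.fetchall()
        conn.commit()
        return columns, rows
    finally:
        conn.close()
```

At compile time the real TMUDF would use the same metadata (column count, names, and SQL types) from the prepared statement to declare the shape of its table-valued output.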






[jira] [Created] (TRAFODION-1584) Install Apache Kafka as an optional add-on in install_local_hadoop

2015-11-03 Thread Hans Zeller (JIRA)
Hans Zeller created TRAFODION-1584:
--

 Summary: Install Apache Kafka as an optional add-on in 
install_local_hadoop
 Key: TRAFODION-1584
 URL: https://issues.apache.org/jira/browse/TRAFODION-1584
 Project: Apache Trafodion
  Issue Type: Sub-task
  Components: sql-general
Affects Versions: 1.3-incubating
Reporter: Hans Zeller
 Fix For: 2.0-incubating


Optionally install Apache Kafka so that we can test integration between Kafka 
and Trafodion.





[jira] [Work stopped] (TRAFODION-1580) Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs

2015-11-03 Thread Hans Zeller (JIRA)

 [ 
https://issues.apache.org/jira/browse/TRAFODION-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on TRAFODION-1580 stopped by Hans Zeller.
--
> Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs
> -
>
> Key: TRAFODION-1580
> URL: https://issues.apache.org/jira/browse/TRAFODION-1580
> Project: Apache Trafodion
>  Issue Type: New Feature
>  Components: sql-general
>Affects Versions: 1.3-incubating
>Reporter: Hans Zeller
>Assignee: Hans Zeller
>
> This is a JIRA for multiple subtasks related to data integration with other 
> Apache projects that can function as data sources or data sinks for 
> Trafodion. This JIRA is specific to work that utilizes TMUDFs (Table-Mapping 
> UDFs).





[jira] [Work started] (TRAFODION-1580) Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs

2015-11-03 Thread Hans Zeller (JIRA)

 [ 
https://issues.apache.org/jira/browse/TRAFODION-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on TRAFODION-1580 started by Hans Zeller.
--
> Integration between Trafodion and Drill, Spark, Kafka, etc., using TMUDFs
> -
>
> Key: TRAFODION-1580
> URL: https://issues.apache.org/jira/browse/TRAFODION-1580
> Project: Apache Trafodion
>  Issue Type: New Feature
>  Components: sql-general
>Affects Versions: 1.3-incubating
>Reporter: Hans Zeller
>Assignee: Hans Zeller
>
> This is a JIRA for multiple subtasks related to data integration with other 
> Apache projects that can function as data sources or data sinks for 
> Trafodion. This JIRA is specific to work that utilizes TMUDFs (Table-Mapping 
> UDFs).





[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management

2015-11-03 Thread liu ming (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987369#comment-14987369
 ] 

liu ming commented on TRAFODION-1578:
-

I like this idea very much. It allows a user to deploy an SPJ via a simple 
trafci command, and since the uploader is itself an SPJ, it is also very easy 
to integrate into a GUI manager.

A few points need more discussion:
1. Store the DLL/jar as a LOB or as an HDFS file?
   An HDFS file seems very good to me.
2. Location of the DLL/jar on the server side.
   $MY_SQROOT/export/lib/udr/cache is a good place, and as Kevin proposed, 
there are benefits to putting DLL/jar files in per-user directories: we get 
finer control, so that user A cannot call user B's SPJ, for example. So maybe 
we can upload the jar to $MY_SQROOT/export/lib/udr/cache/$USER

If the SPJ/UDR needs extra DLL/jar files, we can simply put them into 
$MY_SQROOT/export/lib along with the SPJ jar, since $MY_SQROOT/export/lib is 
already in the CLASSPATH and LD_LIBRARY_PATH. Or we can ask the developer of 
the UDR/SPJ to pack all required DLLs/jars into a single file.

> Proposal for SPJ management
> ---
>
> Key: TRAFODION-1578
> URL: https://issues.apache.org/jira/browse/TRAFODION-1578
> Project: Apache Trafodion
>  Issue Type: Improvement
>  Components: connectivity-dcs
>Reporter: Kevin Xu
>


