[jira] [Created] (TRAFODION-1586) Add support to create an external Trafodion table and map it to a native HBase table

2015-11-04 Thread Anoop Sharma (JIRA)
Anoop Sharma created TRAFODION-1586:
---

             Summary: Add support to create an external Trafodion table and map it to a native HBase table
                 Key: TRAFODION-1586
                 URL: https://issues.apache.org/jira/browse/TRAFODION-1586
             Project: Apache Trafodion
          Issue Type: New Feature
            Reporter: Anoop Sharma
            Assignee: Anoop Sharma
            Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TRAFODION-1587) Update of primary key on table with index when set clause has subquery gives wrong result

2015-11-04 Thread Suresh Subbiah (JIRA)
Suresh Subbiah created TRAFODION-1587:
-

             Summary: Update of primary key on table with index when set clause has subquery gives wrong result
                 Key: TRAFODION-1587
                 URL: https://issues.apache.org/jira/browse/TRAFODION-1587
             Project: Apache Trafodion
          Issue Type: Bug
          Components: sql-cmp
    Affects Versions: 1.2-incubating
            Reporter: Suresh Subbiah
            Assignee: Suresh Subbiah


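Updating the primary key of a table
a) that has an index, and
b) using a self-referencing subquery in the SET clause

sometimes gives a wrong result, as shown below. In this example the row with
C1 = 'A' should end up with C2 = 1 (the subquery counts one distinct C2 value
for C1 = 'A'), but it ends up with C2 = 0.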

Problem found by Selva and analyzed by Dave Birdsall and Selva.

The hash join in the plan causes the delete to occur before the subquery is 
evaluated, even though the subquery scan is early in the plan (node 1).
A fix will attempt to change the hash join to a TSJ (tuple substitution join, 
i.e. a nested join).

create schema mytest;
set schema mytest;
create table mytable (c1 char(1), c2 integer not null primary key);
CREATE INDEX MYTABLE_IDX ON MYTABLE(C1 ASC);
insert into mytable values ('A', 100), ('B', 200), ('C', 300);
select * from mytable order by 1;
prepare xx from update mytable set c2 = 
(select c from (select count(distinct c2) from mytable where c1 = 'A') dt(c))
where c2 = 100 ;
explain options 'f' xx ;
execute xx ;

>>explain options 'f' xx ;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

12   .    13   root                  x                               1.00E+000
10   11   12   nested_join                                           1.00E+000
.    .    11   trafodion_insert                MYTABLE_IDX           1.00E+000
8    9    10   nested_join                                           1.00E+000
.    .    9    trafodion_insert                MYTABLE               1.00E+000
7    .    8    sort                                                  1.00E+000
6    3    7    hybrid_hash_join                                      1.00E+000
4    5    6    nested_anti_semi_joi                                  1.00E+000
.    .    5    trafodion_delete                MYTABLE_IDX           1.00E+000
.    .    4    trafodion_delete                MYTABLE               1.00E+000
2    .    3    sort_scalar_aggr                                      1.00E+000
1    .    2    sort_scalar_aggr                                      1.00E+000
.    .    1    trafodion_index_scan            MYTABLE_IDX           1.00E+001


>>select * from mytable ;

C1  C2
--  -----------

A             0
B           200
C           300
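
For illustration only, the ordering problem can be modeled in a few lines of Java. This is a sketch, not Trafodion code: the class `UpdateOrderSketch` and its methods are hypothetical names, and a `TreeMap` keyed by C2 stands in for MYTABLE. It assumes that a primary key update behaves as a delete of the old row followed by an insert of the new one, and shows only how the position of the subquery relative to the delete changes the new key value.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of MYTABLE (primary key C2 -> C1); not Trafodion code.
final class UpdateOrderSketch {

    // Rows inserted by the repro script: ('A', 100), ('B', 200), ('C', 300).
    static Map<Integer, String> freshTable() {
        Map<Integer, String> t = new TreeMap<>();
        t.put(100, "A");
        t.put(200, "B");
        t.put(300, "C");
        return t;
    }

    // Models: select count(distinct c2) from mytable where c1 = 'A'.
    // C2 is the key, so counting matching rows equals counting distinct C2 values.
    static int countDistinctC2ForA(Map<Integer, String> t) {
        int n = 0;
        for (String c1 : t.values()) {
            if ("A".equals(c1)) {
                n++;
            }
        }
        return n;
    }

    // Intended order: evaluate the subquery, then delete the old row and insert the new one.
    static Map<Integer, String> subqueryThenDelete() {
        Map<Integer, String> t = freshTable();
        int newKey = countDistinctC2ForA(t);
        String c1 = t.remove(100);
        t.put(newKey, c1);
        return t;
    }

    // Problem order: the delete runs first, so the subquery no longer sees the 'A' row.
    static Map<Integer, String> deleteThenSubquery() {
        Map<Integer, String> t = freshTable();
        String c1 = t.remove(100);
        int newKey = countDistinctC2ForA(t);
        t.put(newKey, c1);
        return t;
    }
}
```

In this model `subqueryThenDelete()` leaves the 'A' row under key 1, while `deleteThenSubquery()` leaves it under key 0, which matches the wrong result reported above.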







[jira] [Assigned] (TRAFODION-1585) MDAM plan is not chosen unless NJ is turned off

2015-11-04 Thread David Wayne Birdsall (JIRA)

 [ 
https://issues.apache.org/jira/browse/TRAFODION-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wayne Birdsall reassigned TRAFODION-1585:
---

Assignee: David Wayne Birdsall

> MDAM plan is not chosen unless NJ is turned off
> ---
>
>                 Key: TRAFODION-1585
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1585
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Qifan Chen
>            Assignee: David Wayne Birdsall
>
> MDAM is used in scan node #8 in the following plan.
> >explain options 'f' xx;
> LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
> ---- ---- ---- --------------------  --------  --------------------  ---------
> 15   .    16   root                                                  1.00E+000
> 14   7    15   hybrid_hash_join                                      1.00E+000
> 13   .    14   hash_partial_groupby                                  1.00E+000
> 12   .    13   esp_exchange                    1:21(hash2)           1.00E+000
> 11   .    12   sort_partial_groupby                                  1.00E+000
> 10   .    11   hash_groupby                                          6.60E+004
> 9    .    10   hash_groupby                                          1.35E+005
> 8    .    9    esp_exchange                    21(hash2):16(hash2)   1.40E+006
> .    .    8    trafodion_scan                  OP                    1.40E+006
> 6    .    7    sort_groupby                                          1.00E+000
> 5    .    6    hash_partial_groupby                                  3.09E+001
> 4    .    5    esp_exchange                    1:21(hash2)           3.09E+001
> 3    .    4    hash_partial_groupby                                  3.09E+001
> 2    .    3    hash_groupby                                          1.35E+005
> 1    .    2    esp_exchange                    21(hash2):16(hash2)   1.40E+006
> .    .    1    trafodion_scan                  OP                    1.40E+006
> The same MDAM scan is absent if CQD NESTED_JOINS 'off' is not used. 
> TRAFODION_SCAN  SEQ_NO 8  NO CHILDREN
> TABLE_NAME ............... P
> REQUESTS_IN .............. 1
> ROWS_OUT ................. 1,401,802
> EST_OPER_COST ............ 0.21
> EST_TOTAL_COST ........... 0.21
> DESCRIPTION
>   max_card_est ........... 1.4018e+06
>   fragment_id ............ 5
>   parent_frag ............ 4
>   fragment_type .......... esp
>   scan_type .............. subset scan limited by mdam of table
>                              OP
>   object_type ............ Trafodion
>   cache_size ............. 10,000
>   probes ................. 1
>   rows_accessed .......... 1.4018e+06
>   key_columns ............ _SALT_, SITE_ID, PANEL, DWD_ID
>   executor_predicates .... (PANEL = '1') and (SITE_ID = 450)
>   mdam_disjunct .......... (PANEL = '1') and (SITE_ID = 450) and (_SALT_
>                              >= (\:_sys_HostVarLoHashPart Hash2Distrib 64))
>                              and (_SALT_ <= (\:_sys_HostVarHiHashPart
>                              Hash2Distrib 64))
>   part_key_predicates .... (PANEL = '1' = PANEL) and (SITE_ID = 450
>                              = SITE_ID)





[jira] [Created] (TRAFODION-1585) MDAM plan is not chosen unless NJ is turned off

2015-11-04 Thread Qifan Chen (JIRA)
Qifan Chen created TRAFODION-1585:
-

             Summary: MDAM plan is not chosen unless NJ is turned off
                 Key: TRAFODION-1585
                 URL: https://issues.apache.org/jira/browse/TRAFODION-1585
             Project: Apache Trafodion
          Issue Type: Bug
          Components: sql-cmp
            Reporter: Qifan Chen


MDAM is used in scan node #8 in the following plan.

>explain options 'f' xx;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION           CARD
---- ---- ---- --------------------  --------  --------------------  ---------

15   .    16   root                                                  1.00E+000
14   7    15   hybrid_hash_join                                      1.00E+000
13   .    14   hash_partial_groupby                                  1.00E+000
12   .    13   esp_exchange                    1:21(hash2)           1.00E+000
11   .    12   sort_partial_groupby                                  1.00E+000
10   .    11   hash_groupby                                          6.60E+004
9    .    10   hash_groupby                                          1.35E+005
8    .    9    esp_exchange                    21(hash2):16(hash2)   1.40E+006
.    .    8    trafodion_scan                  OP                    1.40E+006
6    .    7    sort_groupby                                          1.00E+000
5    .    6    hash_partial_groupby                                  3.09E+001
4    .    5    esp_exchange                    1:21(hash2)           3.09E+001
3    .    4    hash_partial_groupby                                  3.09E+001
2    .    3    hash_groupby                                          1.35E+005
1    .    2    esp_exchange                    21(hash2):16(hash2)   1.40E+006
.    .    1    trafodion_scan                  OP                    1.40E+006

The same MDAM scan is absent if CQD NESTED_JOINS 'off' is not used. 

TRAFODION_SCAN  SEQ_NO 8  NO CHILDREN
TABLE_NAME ............... P
REQUESTS_IN .............. 1
ROWS_OUT ................. 1,401,802
EST_OPER_COST ............ 0.21
EST_TOTAL_COST ........... 0.21
DESCRIPTION
  max_card_est ........... 1.4018e+06
  fragment_id ............ 5
  parent_frag ............ 4
  fragment_type .......... esp
  scan_type .............. subset scan limited by mdam of table
                             OP
  object_type ............ Trafodion
  cache_size ............. 10,000
  probes ................. 1
  rows_accessed .......... 1.4018e+06
  key_columns ............ _SALT_, SITE_ID, PANEL, DWD_ID
  executor_predicates .... (PANEL = '1') and (SITE_ID = 450)
  mdam_disjunct .......... (PANEL = '1') and (SITE_ID = 450) and (_SALT_
                             >= (\:_sys_HostVarLoHashPart Hash2Distrib 64))
                             and (_SALT_ <= (\:_sys_HostVarHiHashPart
                             Hash2Distrib 64))
  part_key_predicates .... (PANEL = '1' = PANEL) and (SITE_ID = 450
                             = SITE_ID)






[jira] [Commented] (TRAFODION-1576) Performance improvement and reducing offline interval for backup

2015-11-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990209#comment-14990209
 ] 

ASF GitHub Bot commented on TRAFODION-1576:
---

Github user selvaganesang commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/158#discussion_r43925129
  
--- Diff: core/sqf/hbase_utilities/src/main/java/org/trafodion/utility/backuprestore/TrafExportSnapshot.java ---
@@ -0,0 +1,1076 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.trafodion.utility.backuprestore;
+
+import java.io.BufferedInputStream;
+import java.io.FileNotFoundException;
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.URI;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Random;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileChecksum;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileUtil;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.HRegionInfo;
+import org.apache.hadoop.hbase.io.FileLink;
+import org.apache.hadoop.hbase.io.HFileLink;
+import org.apache.hadoop.hbase.io.HLogLink;
+import org.apache.hadoop.hbase.io.hadoopbackport.ThrottledInputStream;
+import org.apache.hadoop.hbase.mapreduce.JobUtil;
+import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
+import org.apache.hadoop.hbase.protobuf.generated.HBaseProtos.SnapshotDescription;
+import org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos.SnapshotFileInfo;
+import org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos.SnapshotRegionManifest;
+import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;
+import org.apache.hadoop.hbase.util.FSUtils;
+import org.apache.hadoop.hbase.snapshot.*;
+import org.apache.hadoop.hbase.util.Pair;
+import org.apache.hadoop.io.BytesWritable;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.io.SequenceFile;
+import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapreduce.Job;
+import org.apache.hadoop.mapreduce.JobContext;
+import org.apache.hadoop.mapreduce.Mapper;
+import org.apache.hadoop.mapreduce.InputFormat;
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.RecordReader;
+import org.apache.hadoop.mapreduce.TaskAttemptContext;
+import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
+import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
+import org.apache.hadoop.mapreduce.security.TokenCache;
+import org.apache.hadoop.util.StringUtils;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+/**
+ * Export the specified snapshot to a given FileSystem.
+ *
+ * The .snapshot/name folder is copied to the destination cluster
+ * and then all the hfiles/hlogs are copied using a Map-Reduce Job in the .archive/ location.
+ * When everything is done, the second cluster can restore the snapshot.
+ */
+@InterfaceAudience.Public
+@InterfaceStability.Evolving
+public class 

[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management

2015-11-04 Thread Suresh Subbiah (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990060#comment-14990060
 ] 

Suresh Subbiah commented on TRAFODION-1578:
---

The JAR perform process for CALL statements currently works slightly 
differently from what is described above.

Currently what we have is DcsMaster->DcsServer->mxosrvr (these associations 
are made when a trafci, JDBC (T4), or ODBC application connects). When the 
first CALL statement is made from this connection, an MXUDR process is 
associated with mxosrvr. mxosrvr acts as the master executor for the CALL 
statement. The MXUDR process hosts the JVM that executes the SPJ. If the SPJ 
contains SQL, the MXUDR process itself acts as a master executor (with JDBC 
T2). MXUDR processes can be reused between different statements in the same 
connection, in which case the same JVM is used.
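
As a rough illustration of the reuse behavior described above, the sketch below models "one MXUDR process per connection, created on the first CALL". It is only a toy model: the class `MxudrReuseSketch`, the string connection ids, and the integer process ids are all hypothetical and do not correspond to any Trafodion API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of MXUDR reuse; names and ids are illustrative, not Trafodion APIs.
final class MxudrReuseSketch {
    // Simulated association between a connection and its MXUDR process id.
    private final Map<String, Integer> mxudrByConnection = new HashMap<>();
    private int nextProcessId = 1;

    // Returns the id of the simulated MXUDR process serving this connection.
    // The process is created lazily on the first CALL and reused afterwards,
    // so later CALL statements on the same connection see the same JVM.
    int call(String connectionId) {
        return mxudrByConnection.computeIfAbsent(connectionId, id -> nextProcessId++);
    }
}
```

Two calls with the same connection id return the same process id, while a different connection id gets a new one.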

In general JVM management (for those used by SQL queries) is done by the 
Trafodion engine. Is the proposal here suggesting that users may want to 
control that more directly?

Thank you.

> Proposal for SPJ management
> ---
>
>                 Key: TRAFODION-1578
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1578
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: connectivity-dcs
>            Reporter: Kevin Xu
>
> JAR upload process:
> 1. The JAR upload procedure is initialized by default.
> 2. Upload the JAR with Trafci (add library LIB_NAME JAR_LOCAL_PATH). This 
> both uploads the JAR and creates the library. Alternatively, the UPLOAD 
> command on Trafci uploads the JAR only and does not create a library.
>    Tip: Before putting the JAR into HDFS, check its MD5 first. If the file 
> already exists, only add a record to the metadata table, in case users 
> upload the same JAR many times on the platform.
> 3. On the server side, the JAR is stored in HDFS. At the same time, the JAR 
> metadata (path in HDFS, MD5 of the file, and so on) is stored in the stored 
> procedure metadata table.
> 4. CREATE PROCEDURE works the same as it does now.
> JAR perform process:
> 1. Send a CALL from Trafci/JDBC/ODBC/ADO.NET.
> 2. DCSMaster assigns a DCSServer for the CALL.
> 3. DCSServer starts a JVM for the user. The user can modify JVM options, 
> program properties, and the Java classpath. At the same time, a monitor 
> class is started in the JVM, which registers a node on ZooKeeper for this 
> JVM together with metadata (process id, server info, and so on); the node is 
> removed when the JVM exits. The customer can specify a JVM idle time, for 
> real-time scenarios such as a Kafka consumer.
> 4. Useful commands on Trafci: list all JVMs in use; kill one that is no 
> longer in use; restart JVMs with the latest JARs; and so on.
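
The MD5 check in the upload tip above could look roughly like the following sketch. It is an assumption-laden illustration, not part of the proposal: the class `JarUploadSketch` and its methods are hypothetical, an in-memory map stands in for HDFS, and a list stands in for the stored procedure metadata table. Only `java.security.MessageDigest` is a real API here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of MD5-based de-duplication; not Trafodion or HDFS code.
final class JarUploadSketch {
    // Stand-in for HDFS: MD5 hex digest -> stored JAR bytes.
    private final Map<String, byte[]> storedJars = new HashMap<>();
    // Stand-in for the stored procedure metadata table: one record per upload.
    private final List<String> metadataRecords = new ArrayList<>();

    // Computes the lowercase hex MD5 digest of the given bytes.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 must be supported by every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Always adds a metadata record; stores the bytes only if no identical JAR exists.
    // Returns true if the bytes were newly stored, false if they were already present.
    boolean upload(String libName, byte[] jarBytes) {
        String md5 = md5Hex(jarBytes);
        boolean isNew = !storedJars.containsKey(md5);
        if (isNew) {
            storedJars.put(md5, jarBytes.clone());
        }
        metadataRecords.add(libName + " -> " + md5);
        return isNew;
    }

    int storedJarCount() {
        return storedJars.size();
    }

    int metadataRecordCount() {
        return metadataRecords.size();
    }
}
```

Uploading the same bytes under two library names stores one copy but records two metadata entries, which is the behavior the tip describes.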


