[jira] [Created] (TRAFODION-1586) Add support to create an external Trafodion table and map it to a native HBase table
Anoop Sharma created TRAFODION-1586:
------------------------------------

             Summary: Add support to create an external Trafodion table and map it to a native HBase table
                 Key: TRAFODION-1586
                 URL: https://issues.apache.org/jira/browse/TRAFODION-1586
             Project: Apache Trafodion
          Issue Type: New Feature
            Reporter: Anoop Sharma
            Assignee: Anoop Sharma
            Priority: Minor

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (TRAFODION-1587) Update of primary key on table with index when set clause has subquery gives wrong result
Suresh Subbiah created TRAFODION-1587:
--------------------------------------

             Summary: Update of primary key on table with index when set clause has subquery gives wrong result
                 Key: TRAFODION-1587
                 URL: https://issues.apache.org/jira/browse/TRAFODION-1587
             Project: Apache Trafodion
          Issue Type: Bug
          Components: sql-cmp
    Affects Versions: 1.2-incubating
            Reporter: Suresh Subbiah
            Assignee: Suresh Subbiah

Updating the primary key of a table that has a) an index and b) a self-referencing subquery in the set clause sometimes gives a wrong result, as shown below. Problem found by Selva and analyzed by Dave Birdsall and Selva. The hash join in the plan causes the delete to occur before the subquery is evaluated, even though the subquery scan is early in the plan (node 1). A fix will attempt to change the hash join to a TSJ.

create schema mytest;
set schema mytest;
create table mytable (c1 char(1), c2 integer not null primary key);
CREATE INDEX MYTABLE_IDX ON MYTABLE(C1 ASC);
insert into mytable values ('A', 100), ('B', 200), ('C', 300);
select * from mytable order by 1;
prepare xx from
  update mytable
  set c2 = (select c from
            (select count(distinct c2) from mytable where c1 = 'A') dt(c))
  where c2 = 100 ;
explain options 'f' xx ;
execute xx ;

>>explain options 'f' xx ;

LC   RC   OP   OPERATOR              OPT       DESCRIPTION    CARD
-------------------------------------------------------------------------
12   .    13   root                  x                        1.00E+000
10   11   12   nested_join                                    1.00E+000
.    .    11   trafodion_insert                MYTABLE_IDX    1.00E+000
8    9    10   nested_join                                    1.00E+000
.    .    9    trafodion_insert                MYTABLE        1.00E+000
7    .    8    sort                                           1.00E+000
6    3    7    hybrid_hash_join                               1.00E+000
4    5    6    nested_anti_semi_joi                           1.00E+000
.    .    5    trafodion_delete                MYTABLE_IDX    1.00E+000
.    .    4    trafodion_delete                MYTABLE        1.00E+000
2    .    3    sort_scalar_aggr                               1.00E+000
1    .    2    sort_scalar_aggr                               1.00E+000
.    .    1    trafodion_index_scan            MYTABLE_IDX    1.00E+001

>>select * from mytable ;

C1  C2
--  ----------
A            0
B          200
C          300
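A minimal Python sketch (illustrative only, not Trafodion code) of why the evaluation order matters here: the primary-key update is executed as a delete plus an insert, so if the scalar subquery in the SET clause runs after the delete, it sees the table with the target row already gone and counts 0 distinct values instead of 1.

```python
# Sketch (not Trafodion code): a primary-key update is executed as a
# delete followed by an insert. If the scalar subquery in the SET clause
# is evaluated AFTER the delete, it sees a state with the target row gone.

def count_distinct_c2_where_c1_a(rows):
    # select count(distinct c2) from mytable where c1 = 'A'
    return len({c2 for (c1, c2) in rows if c1 == 'A'})

rows = [('A', 100), ('B', 200), ('C', 300)]

# Correct order: evaluate the subquery first, then delete/insert.
new_c2 = count_distinct_c2_where_c1_a(rows)        # row ('A', 100) counted -> 1
correct = [r for r in rows if r[1] != 100] + [('A', new_c2)]

# Buggy order (hash-join plan): delete first, then evaluate the subquery.
deleted = [r for r in rows if r[1] != 100]
buggy_c2 = count_distinct_c2_where_c1_a(deleted)   # row 'A' already gone -> 0
buggy = deleted + [('A', buggy_c2)]

# sorted(correct) -> [('A', 1), ('B', 200), ('C', 300)]
# sorted(buggy)   -> [('A', 0), ('B', 200), ('C', 300)], matching the JIRA output
```

The proposed fix (forcing a TSJ instead of the hash join) serializes the plan so the subquery's scan completes before the delete runs.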
[jira] [Assigned] (TRAFODION-1585) MDAM plan is not chosen unless NJ is turned off
[ https://issues.apache.org/jira/browse/TRAFODION-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Wayne Birdsall reassigned TRAFODION-1585:
-----------------------------------------------

    Assignee: David Wayne Birdsall

> MDAM plan is not chosen unless NJ is turned off
> -----------------------------------------------
>
>                 Key: TRAFODION-1585
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1585
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Qifan Chen
>            Assignee: David Wayne Birdsall
>
> MDAM is used in scan node #8 in the following plan.
>
> >explain options 'f' xx;
>
> LC   RC   OP   OPERATOR              OPT   DESCRIPTION            CARD
> -------------------------------------------------------------------------
> 15   .    16   root                                               1.00E+000
> 14   7    15   hybrid_hash_join                                   1.00E+000
> 13   .    14   hash_partial_groupby                               1.00E+000
> 12   .    13   esp_exchange                1:21(hash2)            1.00E+000
> 11   .    12   sort_partial_groupby                               1.00E+000
> 10   .    11   hash_groupby                                       6.60E+004
> 9    .    10   hash_groupby                                       1.35E+005
> 8    .    9    esp_exchange                21(hash2):16(hash2)    1.40E+006
> .    .    8    trafodion_scan              OP                     1.40E+006
> 6    .    7    sort_groupby                                       1.00E+000
> 5    .    6    hash_partial_groupby                               3.09E+001
> 4    .    5    esp_exchange                1:21(hash2)            3.09E+001
> 3    .    4    hash_partial_groupby                               3.09E+001
> 2    .    3    hash_groupby                                       1.35E+005
> 1    .    2    esp_exchange                21(hash2):16(hash2)    1.40E+006
> .    .    1    trafodion_scan              OP                     1.40E+006
>
> The same MDAM scan is absent if CQD NESTED_JOINS 'off' is not used.
>
> TRAFODION_SCAN ..................... SEQ_NO 8    NO CHILDREN
> TABLE_NAME ................... P
> REQUESTS_IN .............. 1
> ROWS_OUT ......... 1,401,802
> EST_OPER_COST ......... 0.21
> EST_TOTAL_COST ........ 0.21
> DESCRIPTION
>   max_card_est ........... 1.4018e+06
>   fragment_id ............ 5
>   parent_frag ............ 4
>   fragment_type .......... esp
>   scan_type .............. subset scan limited by mdam of table OP
>   object_type ............ Trafodion
>   cache_size ............. 10,000
>   probes ................. 1
>   rows_accessed .......... 1.4018e+06
>   key_columns ............ _SALT_, SITE_ID, PANEL, DWD_ID
>   executor_predicates .... (PANEL = '1') and (SITE_ID = 450)
>   mdam_disjunct .......... (PANEL = '1') and (SITE_ID = 450) and
>                            (_SALT_ >= (\:_sys_HostVarLoHashPart Hash2Distrib 64)) and
>                            (_SALT_ <= (\:_sys_HostVarHiHashPart Hash2Distrib 64))
>   part_key_predicates .... (PANEL = '1' = PANEL) and (SITE_ID = 450 = SITE_ID)
[jira] [Created] (TRAFODION-1585) MDAM plan is not chosen unless NJ is turned off
Qifan Chen created TRAFODION-1585:
----------------------------------

             Summary: MDAM plan is not chosen unless NJ is turned off
                 Key: TRAFODION-1585
                 URL: https://issues.apache.org/jira/browse/TRAFODION-1585
             Project: Apache Trafodion
          Issue Type: Bug
          Components: sql-cmp
            Reporter: Qifan Chen

MDAM is used in scan node #8 in the following plan.

>explain options 'f' xx;

LC   RC   OP   OPERATOR              OPT   DESCRIPTION            CARD
-------------------------------------------------------------------------
15   .    16   root                                               1.00E+000
14   7    15   hybrid_hash_join                                   1.00E+000
13   .    14   hash_partial_groupby                               1.00E+000
12   .    13   esp_exchange                1:21(hash2)            1.00E+000
11   .    12   sort_partial_groupby                               1.00E+000
10   .    11   hash_groupby                                       6.60E+004
9    .    10   hash_groupby                                       1.35E+005
8    .    9    esp_exchange                21(hash2):16(hash2)    1.40E+006
.    .    8    trafodion_scan              OP                     1.40E+006
6    .    7    sort_groupby                                       1.00E+000
5    .    6    hash_partial_groupby                               3.09E+001
4    .    5    esp_exchange                1:21(hash2)            3.09E+001
3    .    4    hash_partial_groupby                               3.09E+001
2    .    3    hash_groupby                                       1.35E+005
1    .    2    esp_exchange                21(hash2):16(hash2)    1.40E+006
.    .    1    trafodion_scan              OP                     1.40E+006

The same MDAM scan is absent if CQD NESTED_JOINS 'off' is not used.

TRAFODION_SCAN ..................... SEQ_NO 8    NO CHILDREN
TABLE_NAME ................... P
REQUESTS_IN .............. 1
ROWS_OUT ......... 1,401,802
EST_OPER_COST ......... 0.21
EST_TOTAL_COST ........ 0.21
DESCRIPTION
  max_card_est ........... 1.4018e+06
  fragment_id ............ 5
  parent_frag ............ 4
  fragment_type .......... esp
  scan_type .............. subset scan limited by mdam of table OP
  object_type ............ Trafodion
  cache_size ............. 10,000
  probes ................. 1
  rows_accessed .......... 1.4018e+06
  key_columns ............ _SALT_, SITE_ID, PANEL, DWD_ID
  executor_predicates .... (PANEL = '1') and (SITE_ID = 450)
  mdam_disjunct .......... (PANEL = '1') and (SITE_ID = 450) and
                           (_SALT_ >= (\:_sys_HostVarLoHashPart Hash2Distrib 64)) and
                           (_SALT_ <= (\:_sys_HostVarHiHashPart Hash2Distrib 64))
  part_key_predicates .... (PANEL = '1' = PANEL) and (SITE_ID = 450 = SITE_ID)
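As background, the core idea behind an MDAM (MultiDimensional Access Method) subset scan is that predicates on non-leading key columns (here SITE_ID and PANEL) can still prune a key-ordered table by probing one small key range per value of the leading key column (here _SALT_), rather than scanning every row. A toy Python sketch with hypothetical data, illustrative only and not Trafodion internals (only the column names are taken from the plan above):

```python
# Toy sketch of an MDAM-style subset scan: the table is ordered on the key
# (_salt_, site_id, panel). The query predicate fixes site_id and panel but
# not the leading _salt_ column, so the scan probes one narrow key range per
# salt value ("one disjunct per salt") instead of reading the whole table.

from bisect import bisect_left, bisect_right

# hypothetical rows, kept in key order (_salt_, site_id, panel)
table = sorted([
    (0, 450, '1'), (0, 451, '2'),
    (1, 450, '1'), (1, 450, '2'),
    (2, 449, '1'), (2, 450, '1'),
])

def mdam_scan(table, site_id, panel, salts):
    """Probe one key range per leading-column value; count rows touched."""
    hits, touched = [], 0
    for salt in salts:                              # one disjunct per salt
        lo = bisect_left(table, (salt, site_id, panel))
        hi = bisect_right(table, (salt, site_id, panel))
        touched += hi - lo
        hits.extend(table[lo:hi])
    return hits, touched

hits, touched = mdam_scan(table, 450, '1', salts=range(3))
# touched == 3 rows, versus 6 for a full scan with the same predicate
```

This is why losing the MDAM plan matters: without it, the same predicate falls back to scanning (and filtering) far more rows.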
[jira] [Commented] (TRAFODION-1576) Performance improvement and reducing offline interval for backup
[ https://issues.apache.org/jira/browse/TRAFODION-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990209#comment-14990209 ]

ASF GitHub Bot commented on TRAFODION-1576:
-------------------------------------------

Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/158#discussion_r43925129

--- Diff: core/sqf/hbase_utilities/src/main/java/org/trafodion/utility/backuprestore/TrafExportSnapshot.java ---
@@ -0,0 +1,1076 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.trafodion.utility.backuprestore;
+
+import java.io.BufferedInputStream;
+import java.io.FileNotFoundException;
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.io.InputStream;
+import java.net.URI;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Random;
+
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileChecksum;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileUtil;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.HRegionInfo;
+import org.apache.hadoop.hbase.io.FileLink;
+import org.apache.hadoop.hbase.io.HFileLink;
+import org.apache.hadoop.hbase.io.HLogLink;
+import org.apache.hadoop.hbase.io.hadoopbackport.ThrottledInputStream;
+import org.apache.hadoop.hbase.mapreduce.JobUtil;
+import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
+import org.apache.hadoop.hbase.protobuf.generated.HBaseProtos.SnapshotDescription;
+import org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos.SnapshotFileInfo;
+import org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos.SnapshotRegionManifest;
+import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;
+import org.apache.hadoop.hbase.util.FSUtils;
+import org.apache.hadoop.hbase.snapshot.*;
+import org.apache.hadoop.hbase.util.Pair;
+import org.apache.hadoop.io.BytesWritable;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.io.SequenceFile;
+import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapreduce.Job;
+import org.apache.hadoop.mapreduce.JobContext;
+import org.apache.hadoop.mapreduce.Mapper;
+import org.apache.hadoop.mapreduce.InputFormat;
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.RecordReader;
+import org.apache.hadoop.mapreduce.TaskAttemptContext;
+import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
+import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;
+import org.apache.hadoop.mapreduce.security.TokenCache;
+import org.apache.hadoop.util.StringUtils;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+/**
+ * Export the specified snapshot to a given FileSystem.
+ *
+ * The .snapshot/name folder is copied to the destination cluster
+ * and then all the hfiles/hlogs are copied using a Map-Reduce Job in the .archive/ location.
+ * When everything is done, the second cluster can restore the snapshot.
+ */
+@InterfaceAudience.Public
+@InterfaceStability.Evolving
+public class
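The class comment above describes a two-phase export: copy the small snapshot manifest first, then bulk-copy the referenced data files. A toy Python sketch of that flow, illustrative only (the real tool copies the data files with a MapReduce job; here the source and destination filesystems are modeled as path-to-bytes dicts, and the layout names are simplified):

```python
# Toy sketch of snapshot export: phase 1 copies the .snapshot/<name>/ manifest
# so the destination knows the snapshot layout; phase 2 copies the referenced
# hfiles/hlogs into the destination's .archive/ location. Filesystems are
# modeled as dicts mapping path -> file bytes.

def export_snapshot(src_fs, dst_fs, name, data_files):
    # Phase 1: copy every entry under the snapshot's manifest directory.
    prefix = '.snapshot/' + name + '/'
    for path, blob in src_fs.items():
        if path.startswith(prefix):
            dst_fs[path] = blob
    # Phase 2: copy the referenced data files (parallelized by MapReduce
    # in the real tool) into the archive location.
    for f in data_files:
        dst_fs['.archive/' + f.split('/')[-1]] = src_fs[f]
    return dst_fs

src = {'.snapshot/snap1/manifest': b'meta', 'data/hfile-0001': b'rows'}
dst = export_snapshot(src, {}, 'snap1', ['data/hfile-0001'])
# dst now holds both the manifest and '.archive/hfile-0001', so the
# destination cluster has everything it needs to restore the snapshot
```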
[jira] [Commented] (TRAFODION-1578) Proposal for SPJ management
[ https://issues.apache.org/jira/browse/TRAFODION-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990060#comment-14990060 ]

Suresh Subbiah commented on TRAFODION-1578:
-------------------------------------------

The JAR execution process for CALL statements is currently slightly different from what is described above. What we have today is DcsMaster -> DcsServer -> mxosrvr (these three associations occur when a trafci, JDBC (T4), or ODBC application connects). When the first CALL statement is made from this connection, an MXUDR process is associated with the mxosrvr, and the mxosrvr acts as master executor for the CALL statement. The MXUDR process hosts the JVM that executes the SPJ. If the SPJ contains SQL, then the MXUDR process itself acts as a master executor (with JDBC T2). MXUDR processes can be reused between different statements in the same connection, in which case the same JVM is used. In general, JVM management (for JVMs used by SQL queries) is done by the Trafodion engine. Is the proposal here suggesting that users may want to control that more directly? Thank you.

> Proposal for SPJ management
> ---------------------------
>
>                 Key: TRAFODION-1578
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1578
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: connectivity-dcs
>            Reporter: Kevin Xu
>
> JAR upload process:
> 1. Initialize the JAR upload procedure by default.
> 2. JAR upload via trafci (add library LIB_NAME JAR_LOCAL_PATH). Both the upload and the library creation are done here. Alternatively, the UPLOAD command in trafci only uploads the JAR and does not create a library.
>    Tip: Before putting the JAR into HDFS, check its MD5 first; if the file already exists, only add a record to the metadata table, in case users upload the same JAR many times on the platform.
> 3. On the server side, the JAR is stored in HDFS. At the same time, the JAR metadata (path in HDFS, MD5 of the file, and others) is stored in the stored-procedure metadata table.
> 4. create procedure works the same as it does now.
> JAR execution process:
> 1.
> Send a CALL via trafci/JDBC/ODBC/ADO.NET.
> 2. DcsMaster assigns a DcsServer for the CALL.
> 3. DcsServer starts a JVM for the user. Users can modify the JVM options, program properties, and Java classpath. At the same time, a monitor class starts in the JVM which registers a node on ZooKeeper for this JVM, along with metadata (process id, server info, and so on); the node is removed when the JVM exits. This allows customers to specify a JVM idle time for real-time scenarios such as a Kafka consumer.
> 4. Useful trafci commands: list all of a user's JVMs; kill one that is no longer in use; restart JVMs with the latest JARs; and so on.
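The MD5-deduplication tip in the proposal's upload process can be sketched as follows. This is a hypothetical helper, not actual DCS code: the HDFS path layout is invented, and HDFS and the metadata table are modeled as a dict and a list so the dedup logic is visible.

```python
# Sketch of the proposed upload tip: hash the JAR first; if a file with the
# same MD5 is already stored, only add a metadata record instead of storing
# the bytes again. "hdfs" is a dict (path -> bytes), "metadata" is a list.

import hashlib

def upload_jar(jar_bytes, hdfs, metadata, lib_name):
    md5 = hashlib.md5(jar_bytes).hexdigest()
    path = '/trafodion/udr/' + md5 + '.jar'   # hypothetical HDFS layout
    if path not in hdfs:                      # identical JAR stored only once
        hdfs[path] = jar_bytes
    metadata.append({'lib': lib_name, 'path': path, 'md5': md5})
    return path

hdfs, meta = {}, []
p1 = upload_jar(b'classes...', hdfs, meta, 'LIB_A')
p2 = upload_jar(b'classes...', hdfs, meta, 'LIB_B')  # same bytes: new record only
# len(hdfs) == 1, len(meta) == 2, p1 == p2
```

Content-addressing by MD5 also makes the "restart JVMs with latest JARs" command cheap to implement: a library's current JAR is whatever path its newest metadata record points at.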