[jira] [Commented] (TRAFODION-1703) Lower overhead in deleting old Tlog entries
[ https://issues.apache.org/jira/browse/TRAFODION-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123984#comment-15123984 ] Trina Krug commented on TRAFODION-1703: --- The change looks good to me. > Lower overhead in deleting old Tlog entries > --- > > Key: TRAFODION-1703 > URL: https://issues.apache.org/jira/browse/TRAFODION-1703 > Project: Apache Trafodion > Issue Type: Improvement > Components: dtm >Affects Versions: 1.3-incubating >Reporter: Sean Broeder >Assignee: Sean Broeder > Fix For: 1.3-incubating > > Original Estimate: 336h > Remaining Estimate: 336h > > Currently, Tlog entries are maintained for a period of time when there is a > chance they might be needed to help recover from some sort of failure. After > that window expires the old records can be deleted. > The current mechanism starts a scanner on the region which sends every record > to the client Tlog component, where the record is evaluated and, if the > deletion criteria are met, a Delete is created and sent back to the region. > This creates a lot of unnecessary message traffic since none of the contents > of the record is actually needed by the Tlog component during the maintenance > operation. > It would be better to send the deletion criteria to the region so that the > region itself can perform the necessary housekeeping without sending all the > data back to the client Tlog component. > I believe the endpoint coprocessor service could be used to perform this work. > Comments? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
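The traffic argument in the description above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Trafodion code: the region is modelled as a plain dict of key to timestamp, the function names are hypothetical, and each record or Delete is counted as one message (a real scanner batches rows, so the absolute numbers would differ, but the scaling is the point).

```python
def client_side_cleanup(region, cutoff):
    """Current mechanism: every record travels to the client, Deletes travel back."""
    messages = 0
    for key in list(region):
        messages += 1                 # record shipped region -> client
        if region[key] < cutoff:      # client evaluates the deletion criteria
            messages += 1             # Delete shipped client -> region
            del region[key]
    return messages


def region_side_cleanup(region, cutoff):
    """Proposed mechanism: only the criteria travel; the region deletes locally."""
    messages = 1                      # one request carrying the cutoff
    for key in [k for k, ts in region.items() if ts < cutoff]:
        del region[key]               # housekeeping happens inside the region
    messages += 1                     # one response back to the client
    return messages


if __name__ == "__main__":
    a = {i: i for i in range(10)}     # hypothetical records: key -> timestamp
    b = dict(a)
    print(client_side_cleanup(a, 5), region_side_cleanup(b, 5))
```

In this model the client-side cost grows with the number of records in the region, while the region-side cost stays constant regardless of how many records are scanned or deleted.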
[jira] [Resolved] (TRAFODION-1648) Synchronization issues in TrxRegion* coprocessor code for Region split/balance
[ https://issues.apache.org/jira/browse/TRAFODION-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trina Krug resolved TRAFODION-1648. --- Resolution: Fixed > Synchronization issues in TrxRegion* coprocessor code for Region > split/balance > --- > > Key: TRAFODION-1648 > URL: https://issues.apache.org/jira/browse/TRAFODION-1648 > Project: Apache Trafodion > Issue Type: Bug > Components: dtm >Reporter: Trina Krug >Assignee: Trina Krug > > --New transactions become disabled in preClose/preSplit, but other requests > such as commitRequest are allowed to get through after this point. This > poses an issue when gathering/writing transaction state at flush time. A > slight design modification is required. > --In addition, detailed review and possible modification of shared lists > should be done in this region split/balance area, including those affected > above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TRAFODION-1885) Online node expansion
Trina Krug created TRAFODION-1885: - Summary: Online node expansion Key: TRAFODION-1885 URL: https://issues.apache.org/jira/browse/TRAFODION-1885 Project: Apache Trafodion Issue Type: Improvement Components: foundation Reporter: Trina Krug Assignee: Trina Krug Need the ability to add nodes to the existing online configuration without an outage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TRAFODION-1885) Online node expansion
[ https://issues.apache.org/jira/browse/TRAFODION-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trina Krug updated TRAFODION-1885: -- Description: Need the ability to add nodes without an outage. (was: Need the ability to add nodes to the existing online configuration without an outage.) > Online node expansion > - > > Key: TRAFODION-1885 > URL: https://issues.apache.org/jira/browse/TRAFODION-1885 > Project: Apache Trafodion > Issue Type: Improvement > Components: foundation >Reporter: Trina Krug >Assignee: Trina Krug > > Need the ability to add nodes without an outage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TRAFODION-1885) Online node expansion
[ https://issues.apache.org/jira/browse/TRAFODION-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193571#comment-15193571 ] Trina Krug commented on TRAFODION-1885: ---
Initial Configuration/Pre-work:
* A static initial configuration can be made to include a larger set of nodes than what will initially be used. The node names do not need to be known, as those can be changed at a later date through the shell.
* The sqconfig file will need to contain the max number of configurable nodes.
* An environment variable (TRAF_EXCLUDE_LIST) will be set in /etc/trafodion/trafodion_config and contain a list of space-separated node names to be excluded at initial startup (TRAF_EXCLUDE_LIST=nodenameX nodenameY...). Each node needs an up-to-date version of this file.
* The DCS servers file will need to be modified to include only those nodes that are active at initial startup time ($DCS_INSTALL_DIR/conf/servers). Each node needs an up-to-date version of this file.
To add a node online:
* Modify TRAF_EXCLUDE_LIST to remove the node that you will be adding (one node at a time). Make sure any shell you use after this point is fresh and has the updated TRAF_EXCLUDE_LIST in its latest form.
* With “sqshell -a”, reintegrate the new node into the cluster by issuing an “up nodename” command.
* Repeat the steps above for each new node. Please wait a few minutes between adding nodes to give the cluster a chance to settle.
* Modify the DCS servers file ($DCS_INSTALL_DIR/conf/servers) to include the hostname of each new node along with the count of servers that need to be started on it, and copy the servers file to all nodes in the cluster. With the ‘$DCS_INSTALL_DIR/bin/dcs-daemons.sh start server’ command you can start additional DcsServer and mxosrvrs on the newly added nodes without affecting the mxosrvrs that are already up and running.
Verify by issuing the ‘dcscheck’ command, which displays the count of all DcsServers and mxosrvrs in the cluster.
> Online node expansion > - > > Key: TRAFODION-1885 > URL: https://issues.apache.org/jira/browse/TRAFODION-1885 > Project: Apache Trafodion > Issue Type: Improvement > Components: foundation >Reporter: Trina Krug >Assignee: Trina Krug > > Need the ability to add nodes without an outage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
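The two file edits in the procedure above (taking a node out of TRAF_EXCLUDE_LIST and adding it to the DCS servers file) can be sketched as plain string transformations. This is a minimal sketch, not a Trafodion tool: both function names are hypothetical, and the servers file is assumed to hold one "hostname count" pair per line, which the comment implies but does not spell out. Running "sqshell -a" and dcs-daemons.sh is deliberately left out.

```python
def remove_from_exclude_list(exclude_list: str, node: str) -> str:
    """Return the space-separated TRAF_EXCLUDE_LIST value without `node`."""
    nodes = exclude_list.split()
    if node not in nodes:
        raise ValueError(f"{node} is not in the exclude list")
    return " ".join(n for n in nodes if n != node)


def add_to_servers_file(servers_text: str, node: str, count: int) -> str:
    """Return servers-file text with a line for `node` appended.

    Assumption: one 'hostname count' entry per line.
    """
    lines = [ln for ln in servers_text.splitlines() if ln.strip()]
    if any(ln.split()[0] == node for ln in lines):
        raise ValueError(f"{node} is already listed")
    lines.append(f"{node} {count}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical node names, mirroring the nodenameX/nodenameY placeholders.
    print(remove_from_exclude_list("nodenameX nodenameY", "nodenameX"))
    print(add_to_servers_file("node1 4\n", "nodenameX", 4), end="")
```

Both helpers refuse a no-op (node not excluded, or node already listed), since either case suggests the files on this host are out of date, which the procedure warns against.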
[jira] [Resolved] (TRAFODION-1885) Online node expansion
[ https://issues.apache.org/jira/browse/TRAFODION-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trina Krug resolved TRAFODION-1885. --- Resolution: Fixed > Online node expansion > - > > Key: TRAFODION-1885 > URL: https://issues.apache.org/jira/browse/TRAFODION-1885 > Project: Apache Trafodion > Issue Type: Improvement > Components: foundation >Reporter: Trina Krug >Assignee: Trina Krug > > Need the ability to add nodes without an outage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TRAFODION-1988) Better java exception handling in the java layer of Trafodion
[ https://issues.apache.org/jira/browse/TRAFODION-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15287938#comment-15287938 ] Trina Krug commented on TRAFODION-1988: ---
Passed basic TM testing.
1) Test case 1 : TM killed in the middle of transaction. Transaction was aborted, table consistent.
2) Test case 2 : TM killed in the middle of commit processing. Transaction was aborted, table consistent.
3) Test case 3 : Commit conflict. Received appropriate error. Table as expected.
4) Test case 4 : Regionserver killed in the middle of transaction. Transaction aborted, table consistent.
> Better java exception handling in the java layer of Trafodion > - > > Key: TRAFODION-1988 > URL: https://issues.apache.org/jira/browse/TRAFODION-1988 > Project: Apache Trafodion > Issue Type: Improvement > Components: dtm, sql-exe >Affects Versions: 2.1-incubating >Reporter: Selvaganesan Govindarajan >Assignee: Selvaganesan Govindarajan > > Java exceptions are not handled in a consistent manner in Trafodion. The SQL > interface layer in Trafodion is capable of displaying the entire Java stack > trace to the client application when an exception is raised in the Java portion > of the Trafodion/Hbase/Hdfs stack. However, there are portions of the code in > Trafodion where the exceptions are not handled in a consistent manner. This > JIRA attempts to fix this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TRAFODION-1988) Better java exception handling in the java layer of Trafodion
[ https://issues.apache.org/jira/browse/TRAFODION-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351717#comment-15351717 ] Trina Krug commented on TRAFODION-1988: ---
Pull Request #559 passed basic TM testing.
1) Test case 1 : TM killed in the middle of transaction. Transaction was aborted, table consistent.
2) Test case 2 : TM killed in the middle of commit processing. Transaction was aborted, table consistent.
3) Test case 3 : Commit conflict. Received appropriate error. Table as expected.
4) Test case 4 : Regionserver killed in the middle of transaction. Transaction aborted, table consistent.
> Better java exception handling in the java layer of Trafodion > - > > Key: TRAFODION-1988 > URL: https://issues.apache.org/jira/browse/TRAFODION-1988 > Project: Apache Trafodion > Issue Type: Improvement > Components: dtm, sql-exe >Affects Versions: 2.1-incubating >Reporter: Selvaganesan Govindarajan >Assignee: Selvaganesan Govindarajan > > Java exceptions are not handled in a consistent manner in Trafodion. The SQL > interface layer in Trafodion is capable of displaying the entire Java stack > trace to the client application when an exception is raised in the Java portion > of the Trafodion/Hbase/Hdfs stack. However, there are portions of the code in > Trafodion where the exceptions are not handled in a consistent manner. This > JIRA attempts to fix this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TRAFODION-1648) Synchronization issues in TrxRegion* coprocessor code for Region split/balance
Trina Krug created TRAFODION-1648: - Summary: Synchronization issues in TrxRegion* coprocessor code for Region split/balance Key: TRAFODION-1648 URL: https://issues.apache.org/jira/browse/TRAFODION-1648 Project: Apache Trafodion Issue Type: Bug Components: dtm Reporter: Trina Krug --New transactions become disabled in preClose/preSplit, but other requests such as commitRequest are allowed to get through after this point. This poses an issue when gathering/writing transaction state at flush time. A slight design modification is required. --In addition, detailed review and possible modification of shared lists should be done in this region split/balance area, including those affected above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TRAFODION-1626) DTMCI STATS shows incredibly high transaction counts while sqstart is in progress
[ https://issues.apache.org/jira/browse/TRAFODION-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032597#comment-15032597 ] Trina Krug commented on TRAFODION-1626: --- Error checking is not being done in the "json" case for the TMSTATS call. Moved the error checking outside the (json == false) case. > DTMCI STATS shows incredibly high transaction counts while sqstart is in > progress > - > > Key: TRAFODION-1626 > URL: https://issues.apache.org/jira/browse/TRAFODION-1626 > Project: Apache Trafodion > Issue Type: Bug > Components: dtm >Reporter: Venkat Muthuswamy >Assignee: Trina Krug > > When 'dtmci stats -j' command is invoked while the trafodion instance is > still starting up, we see incredibly high numbers for the transaction counts. > $ dtmci stats -j > [{"node": 0,"txnStats":{"txnBegins": 140727707549696,"txnAborts": > 140334087297536,"txnCommits": 0}},{"node": 1,"txnStats":{"txnBegins": > 140727707549696,"txnAborts": 140334087297536,"txnCommits": 0}},{"node": > 2,"txnStats":{"txnBegins": 140727707549696,"txnAborts": > 140334087297536,"txnCommits": 0}}] > $ dtmci stats -j > [{"node": 0,"txnStats":{"txnBegins": 140732063999184,"txnAborts": > 140429416520192,"txnCommits": 0}},{"node": 1,"txnStats":{"txnBegins": > 140732063999184,"txnAborts": 140429416520192,"txnCommits": 0}},{"node": > 2,"txnStats":{"txnBegins": 140732063999184,"txnAborts": > 140429416520192,"txnCommits": 0}}] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TRAFODION-1626) DTMCI STATS shows incredibly high transaction counts while sqstart is in progress
[ https://issues.apache.org/jira/browse/TRAFODION-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032768#comment-15032768 ] Trina Krug commented on TRAFODION-1626: --- I followed the model for process_statustm, which simply prints an error in both cases regardless. If this is incorrect, I can correct both cases. There is, however, another case that handles it altogether differently and simply prints out null/zero type values in its formatted data (as well as NOTAVILABLE, etc.). Venkat - I'd like your input as to which way is best/expected. > DTMCI STATS shows incredibly high transaction counts while sqstart is in > progress > - > > Key: TRAFODION-1626 > URL: https://issues.apache.org/jira/browse/TRAFODION-1626 > Project: Apache Trafodion > Issue Type: Bug > Components: dtm >Reporter: Venkat Muthuswamy >Assignee: Trina Krug > > When 'dtmci stats -j' command is invoked while the trafodion instance is > still starting up, we see incredibly high numbers for the transaction counts. > $ dtmci stats -j > [{"node": 0,"txnStats":{"txnBegins": 140727707549696,"txnAborts": > 140334087297536,"txnCommits": 0}},{"node": 1,"txnStats":{"txnBegins": > 140727707549696,"txnAborts": 140334087297536,"txnCommits": 0}},{"node": > 2,"txnStats":{"txnBegins": 140727707549696,"txnAborts": > 140334087297536,"txnCommits": 0}}] > $ dtmci stats -j > [{"node": 0,"txnStats":{"txnBegins": 140732063999184,"txnAborts": > 140429416520192,"txnCommits": 0}},{"node": 1,"txnStats":{"txnBegins": > 140732063999184,"txnAborts": 140429416520192,"txnCommits": 0}},{"node": > 2,"txnStats":{"txnBegins": 140732063999184,"txnAborts": > 140429416520192,"txnCommits": 0}}] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (TRAFODION-1626) DTMCI STATS shows incredibly high transaction counts while sqstart is in progress
[ https://issues.apache.org/jira/browse/TRAFODION-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trina Krug resolved TRAFODION-1626. --- Resolution: Fixed Simple string error returned in both regular and json cases when TMSTATS returns error. > DTMCI STATS shows incredibly high transaction counts while sqstart is in > progress > - > > Key: TRAFODION-1626 > URL: https://issues.apache.org/jira/browse/TRAFODION-1626 > Project: Apache Trafodion > Issue Type: Bug > Components: dtm >Reporter: Venkat Muthuswamy >Assignee: Trina Krug > > When 'dtmci stats -j' command is invoked while the trafodion instance is > still starting up, we see incredibly high numbers for the transaction counts. > $ dtmci stats -j > [{"node": 0,"txnStats":{"txnBegins": 140727707549696,"txnAborts": > 140334087297536,"txnCommits": 0}},{"node": 1,"txnStats":{"txnBegins": > 140727707549696,"txnAborts": 140334087297536,"txnCommits": 0}},{"node": > 2,"txnStats":{"txnBegins": 140727707549696,"txnAborts": > 140334087297536,"txnCommits": 0}}] > $ dtmci stats -j > [{"node": 0,"txnStats":{"txnBegins": 140732063999184,"txnAborts": > 140429416520192,"txnCommits": 0}},{"node": 1,"txnStats":{"txnBegins": > 140732063999184,"txnAborts": 140429416520192,"txnCommits": 0}},{"node": > 2,"txnStats":{"txnBegins": 140732063999184,"txnAborts": > 140429416520192,"txnCommits": 0}}] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
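The fix discussed in this thread (checking the TMSTATS error before formatting, so that both the regular and the JSON path return a simple error string) can be sketched as follows. This is an illustration only, not the dtmci source: the function name, the error text, and the plain-text layout are hypothetical, and only the JSON field names are taken from the output quoted in the issue.

```python
import json


def format_stats(error: int, stats: list, as_json: bool) -> str:
    """Format per-node transaction counters, or an error string on failure."""
    # The check sits before the json/non-json split so neither path can
    # format counters that were never filled in.
    if error != 0:
        return f"TMSTATS returned error {error}"   # hypothetical wording
    if as_json:
        return json.dumps(
            [{"node": node, "txnStats": counters}
             for node, counters in enumerate(stats)]
        )
    return "\n".join(
        f"node {node}: begins={c['txnBegins']} "
        f"aborts={c['txnAborts']} commits={c['txnCommits']}"
        for node, c in enumerate(stats)
    )


if __name__ == "__main__":
    sample = [{"txnBegins": 3, "txnAborts": 1, "txnCommits": 2}]
    print(format_stats(0, sample, as_json=True))
    print(format_stats(1, [], as_json=True))
```

With the check inside only one branch, the other branch would go on to print whatever the counter fields happened to contain, which is consistent with the very large values reported while sqstart was still in progress.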
[jira] [Commented] (TRAFODION-1648) Synchronization issues in TrxRegion* coprocessor code for Region split/balance
[ https://issues.apache.org/jira/browse/TRAFODION-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041903#comment-15041903 ] Trina Krug commented on TRAFODION-1648: ---
Will fix/change the following 2 things for this JIRA:
1) Disable all non-phase-2 transactional operations earlier in the split/rebalance process. There will be a tiered approach for disabling requests, with the goal of minimizing transaction outage and allowing the region operation to go as quickly as possible. This piece will address:
-- synchronization issues
-- the potential of prolonging a split/rebalance indefinitely
-- missing a transaction changing state due to timing windows
2) Fix data synchronization issues for the active transaction list, commit pending list and scanners list at split/rebalance time (should be solved by the above, but for completeness' sake it will be addressed).
> Synchronization issues in TrxRegion* coprocessor code for Region > split/balance > --- > > Key: TRAFODION-1648 > URL: https://issues.apache.org/jira/browse/TRAFODION-1648 > Project: Apache Trafodion > Issue Type: Bug > Components: dtm >Reporter: Trina Krug >Assignee: Trina Krug > > --New transactions become disabled in preClose/preSplit, but other requests > such as commitRequest are allowed to get through after this point. This > poses an issue when gathering/writing transaction state at flush time. A > slight design modification is required. > --In addition, detailed review and possible modification of shared lists > should be done in this region split/balance area, including those affected > above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
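One way to picture the tiered approach described above is a small gate that each incoming request consults. This is a sketch under assumptions, not the TrxRegion* coprocessor design: the comment does not name the tiers, so the tier names, the request names other than commitRequest, and the choice of "commit" and "abort" as the phase 2 operations are all hypothetical.

```python
import threading
from enum import IntEnum


class Tier(IntEnum):
    OPEN = 0           # normal operation, everything allowed
    NO_NEW_TXNS = 1    # new transactions rejected, in-flight work continues
    PHASE2_ONLY = 2    # only phase 2 requests may pass
    CLOSED = 3         # region is closing, nothing passes


PHASE2_REQUESTS = {"commit", "abort"}   # hypothetical request names


class RegionGate:
    """Decides which requests may pass while a split/rebalance progresses."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tier = Tier.OPEN

    def escalate(self, tier: Tier) -> None:
        # Tiers only move forward; a late caller cannot reopen the region.
        with self._lock:
            self._tier = max(self._tier, tier)

    def allows(self, request: str) -> bool:
        with self._lock:
            tier = self._tier
        if tier == Tier.OPEN:
            return True
        if tier == Tier.NO_NEW_TXNS:
            return request != "begin"
        if tier == Tier.PHASE2_ONLY:
            return request in PHASE2_REQUESTS
        return False


if __name__ == "__main__":
    gate = RegionGate()
    gate.escalate(Tier.PHASE2_ONLY)
    print(gate.allows("commitRequest"), gate.allows("commit"))
```

Because the tier is read and changed under one lock and never decreases, a request such as commitRequest cannot slip through after the gate has reached the phase-2-only tier, which is the timing window the comment sets out to close.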