[jira] [Commented] (TRAFODION-1609) Change swstatus to give a simple "up" or "down" answer, instead of the current, confusing display

2016-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352317#comment-15352317
 ] 

ASF GitHub Bot commented on TRAFODION-1609:
---

Github user mkby commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/562#discussion_r68695568
  
--- Diff: core/sqf/sql/scripts/install_local_hadoop ---
@@ -847,20 +847,31 @@ EOF
 
   cat <$MY_SW_SCRIPTS_DIR/swstatus
 #!/bin/sh
-cd ${MY_SW_ROOT}
-. $MY_SW_SCRIPTS_DIR/sw_env.sh
-NUM_JAVA_PROCS=\`ps -aef | grep \$USER | grep java | grep -v grep | wc -l\`
-NUM_MYSQLD_PROCS=\`ps -aef | grep \$USER | grep mysqld | grep -v grep | wc -l\`
+cd \${MY_SW_ROOT}
+. \$MY_SW_SCRIPTS_DIR/sw_env.sh
+JPS_OUTPUT=\`jps\`
+SERVICES='HMaster NodeManager ResourceManager NameNode DataNode SecondaryNameNode'
+for s in \$SERVICES; do
+if [[ ! \$JPS_OUTPUT =~ \$s ]]; then
+MISS_SERVICE="\$s \$MISS_SERVICE"
+fi
+done
 
-if [ "\$1" == "-v" ]; then
-  ps -aef | grep \$USER | grep java | grep -v grep
-  ps -aef | grep \$USER | grep mysqld | grep -v grep
+if [[ \$MISS_SERVICE != '' ]]; then
+echo "ERROR: Service \"\$MISS_SERVICE\" are not up!"
+exit 1
+else
+echo "The local hadoop services are up!"
 fi
 
-echo "\$NUM_JAVA_PROCS java servers and \$NUM_MYSQLD_PROCS mysqld processes are running"
-
-jps | grep -v Jps
-
+NUM_MYSQLD_PROCS=\`ps -aef | grep \$USER | grep mysqld | grep -v grep | wc -l\`
+if [[ \$NUM_MYSQLD_PROCS -ne 0 ]]; then
+echo "\$NUM_MYSQLD_PROCS mysqld processes are running"
+exit 0
+else
+echo "ERROR: mysqld process is not running!"
+exit 1
+fi
 EOF
 
--- End diff --

What are the other generated scripts? 


> Change swstatus to give a simple "up" or "down" answer, instead of the 
> current, confusing display
> -
>
> Key: TRAFODION-1609
> URL: https://issues.apache.org/jira/browse/TRAFODION-1609
> Project: Apache Trafodion
> Issue Type: Improvement
> Components: sql-general
> Reporter: Hans Zeller
> Assignee: Eason Zhang
> Priority: Minor
>
> The "swstatus" script used in cases where we did install_local_hadoop counts 
> the MySQL processes and then simply invokes jps.
> It would be better if the script would check specifically for HMaster, 
> NameNode, DataNode and maybe others, like NodeManager and ResourceManager. If 
> any are missing, it should print out which ones are missing, otherwise it 
> should just say something like "processes are up".
> It should also return an exit code of 0 if everything is up and 1 if some 
> processes are missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TRAFODION-1988) Better java exception handling in the java layer of Trafodion

2016-06-27 Thread Trina Krug (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351717#comment-15351717
 ] 

Trina Krug commented on TRAFODION-1988:
---

Pull Request #559

Passed basic TM testing.
1) Test case 1 : TM killed in the middle of transaction. Transaction was 
aborted, table consistent.
2) Test case 2 : TM killed in the middle of commit processing. Transaction was 
aborted, table consistent.
3) Test case 3 : Commit conflict. Received appropriate error. Table as expected.
4) Test case 4 : Regionserver killed in the middle of transaction. Transaction 
aborted, table consistent.

> Better java exception handling in the java layer of Trafodion
> -
>
> Key: TRAFODION-1988
> URL: https://issues.apache.org/jira/browse/TRAFODION-1988
> Project: Apache Trafodion
> Issue Type: Improvement
> Components: dtm, sql-exe
> Affects Versions: 2.1-incubating
> Reporter: Selvaganesan Govindarajan
> Assignee: Selvaganesan Govindarajan
>
> Java exceptions are not handled in a consistent manner in Trafodion. The SQL 
> interface layer in Trafodion is capable of displaying the entire Java stack 
> trace to the client application when an exception is raised in the Java portion 
> of the Trafodion/HBase/HDFS stack. However, there are portions of the code in 
> Trafodion where exceptions are not handled in a consistent manner. This 
> JIRA attempts to fix this.





[jira] [Commented] (TRAFODION-1609) Change swstatus to give a simple "up" or "down" answer, instead of the current, confusing display

2016-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351611#comment-15351611
 ] 

ASF GitHub Bot commented on TRAFODION-1609:
---

Github user selvaganesang commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/562#discussion_r68635524
  
--- Diff: core/sqf/sql/scripts/install_local_hadoop ---
@@ -847,20 +847,31 @@ EOF
 
   cat <$MY_SW_SCRIPTS_DIR/swstatus
 #!/bin/sh
-cd ${MY_SW_ROOT}
-. $MY_SW_SCRIPTS_DIR/sw_env.sh
-NUM_JAVA_PROCS=\`ps -aef | grep \$USER | grep java | grep -v grep | wc -l\`
-NUM_MYSQLD_PROCS=\`ps -aef | grep \$USER | grep mysqld | grep -v grep | wc -l\`
+cd \${MY_SW_ROOT}
+. \$MY_SW_SCRIPTS_DIR/sw_env.sh
+JPS_OUTPUT=\`jps\`
+SERVICES='HMaster NodeManager ResourceManager NameNode DataNode SecondaryNameNode'
+for s in \$SERVICES; do
+if [[ ! \$JPS_OUTPUT =~ \$s ]]; then
+MISS_SERVICE="\$s \$MISS_SERVICE"
+fi
+done
 
-if [ "\$1" == "-v" ]; then
-  ps -aef | grep \$USER | grep java | grep -v grep
-  ps -aef | grep \$USER | grep mysqld | grep -v grep
+if [[ \$MISS_SERVICE != '' ]]; then
+echo "ERROR: Service \"\$MISS_SERVICE\" are not up!"
+exit 1
+else
+echo "The local hadoop services are up!"
 fi
 
-echo "\$NUM_JAVA_PROCS java servers and \$NUM_MYSQLD_PROCS mysqld processes are running"
-
-jps | grep -v Jps
-
+NUM_MYSQLD_PROCS=\`ps -aef | grep \$USER | grep mysqld | grep -v grep | wc -l\`
+if [[ \$NUM_MYSQLD_PROCS -ne 0 ]]; then
+echo "\$NUM_MYSQLD_PROCS mysqld processes are running"
+exit 0
+else
+echo "ERROR: mysqld process is not running!"
+exit 1
+fi
 EOF
 
--- End diff --

Is there a need to do it the same way for the other generated scripts too?




[jira] [Commented] (TRAFODION-1988) Better java exception handling in the java layer of Trafodion

2016-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351497#comment-15351497
 ] 

ASF GitHub Bot commented on TRAFODION-1988:
---

Github user selvaganesang commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/559#discussion_r68625772
  
--- Diff: core/sqf/src/seatrans/tm/hbasetmlib2/src/main/java/org/trafodion/dtm/HBaseTxClient.java ---
@@ -452,7 +436,13 @@ public short abortTransaction(final long transactionID) throws Exception {
   Object commitDDLLock = new Object();
   synchronized(commitDDLLock)
   {
-commitDDLLock.wait();
+ boolean loopBack = false;
+ try {
+commitDDLLock.wait();
+ } catch(InterruptedException ie) {
+ LOG.warn("Interrupting commitDDLLock.wait,  but retrying ", ie);
+ loopBack = true;
+ } while (loopBack);
--- End diff --

Interesting. It is not clear why the Java compiler didn't catch this issue. I 
am fixing the code to ignore InterruptedException.
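
On the compiler question: as far as the language rules go, a try/catch followed by "while (loopBack);" is two separate, individually valid statements (a try statement, then a while loop whose body is the empty statement), so there is nothing for the compiler to reject. The sketch below only illustrates that reading; the class name, the method and the simulated interrupt are assumptions made for the example and are not taken from the pull request.

```java
public class TryThenWhileSketch {

    // Hypothetical stand-in for the pattern in the diff.
    // Returns how many times the try body was executed.
    static int runOnce(boolean simulateInterrupt) {
        int attempts = 0;
        boolean loopBack = false;
        try {
            attempts++;
            if (simulateInterrupt) {
                // Stands in for commitDDLLock.wait() being interrupted.
                throw new InterruptedException("simulated");
            }
        } catch (InterruptedException ie) {
            loopBack = true;
        }
        // This is a separate statement: a plain while loop.
        // It never re-enters the try block above.
        while (loopBack) {
            // In the diff the body is just ";", so a true flag would spin forever.
            // The flag is cleared here only so that this sketch terminates.
            loopBack = false;
        }
        return attempts;
    }

    public static void main(String[] args) {
        System.out.println(runOnce(false)); // prints 1
        System.out.println(runOnce(true));  // prints 1: the try body is not retried
    }
}
```

In both cases the try body runs exactly once, which matches the concern raised in the review that the trailing while does not retry anything.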




[jira] [Commented] (TRAFODION-1988) Better java exception handling in the java layer of Trafodion

2016-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351444#comment-15351444
 ] 

ASF GitHub Bot commented on TRAFODION-1988:
---

Github user DaveBirdsall commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/559#discussion_r68619686
  
--- Diff: core/sqf/src/seatrans/tm/hbasetmlib2/src/main/java/org/trafodion/dtm/TmAuditTlog.java ---
@@ -961,18 +814,14 @@ public static boolean deleteRecord(final long lvTransid) throws IOException {
   if (LOG.isTraceEnabled()) LOG.trace("deleteRecord start " + lvTransid);
   String transidString = new String(String.valueOf(lvTransid));
   int lv_lockIndex = (int)(lvTransid & tLogHashKey);
-  try {
+  //try {
--- End diff --

Maybe this commented-out line should be removed, since the corresponding catch 
block was removed?




[jira] [Commented] (TRAFODION-1988) Better java exception handling in the java layer of Trafodion

2016-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351442#comment-15351442
 ] 

ASF GitHub Bot commented on TRAFODION-1988:
---

Github user DaveBirdsall commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/559#discussion_r68619110
  
--- Diff: core/sqf/src/seatrans/tm/hbasetmlib2/src/main/java/org/trafodion/dtm/HBaseTxClient.java ---
@@ -577,7 +568,13 @@ public short doCommit(long transactionId) throws Exception {
   Object commitDDLLock = new Object();
   synchronized(commitDDLLock)
   {
-commitDDLLock.wait();
+ boolean loopBack = false;
+ try {
+   commitDDLLock.wait();
+ } catch(InterruptedException ie) {
+ LOG.warn("Interrupting commitDDLLock.wait,  but retrying ", ie);
+ loopBack = true;
+ } while (loopBack);
--- End diff --

Same comment as above.




[jira] [Commented] (TRAFODION-1988) Better java exception handling in the java layer of Trafodion

2016-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351443#comment-15351443
 ] 

ASF GitHub Bot commented on TRAFODION-1988:
---

Github user DaveBirdsall commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/559#discussion_r68619149
  
--- Diff: core/sqf/src/seatrans/tm/hbasetmlib2/src/main/java/org/trafodion/dtm/HBaseTxClient.java ---
@@ -597,24 +594,25 @@ public short doCommit(long transactionId) throws Exception {
return TransReturnCode.RET_OK.getShort();
}
 
-   public short completeRequest(long transactionId) throws Exception {
+   public short completeRequest(long transactionId) throws IOException, CommitUnsuccessfulException {
  if (LOG.isDebugEnabled()) LOG.debug("Enter completeRequest, txid: " + transactionId);
  TransactionState ts = mapTransactionStates.get(transactionId);
 
  if(ts == null) {
   LOG.error("Returning from HBaseTxClient:completeRequest, (null tx) retval: " + TransReturnCode.RET_NOTX.toString() + " txid: " + transactionId);
   return TransReturnCode.RET_NOTX.getShort();
}
-
+  
+   boolean loopBack = false;
try {
 
if (LOG.isTraceEnabled()) LOG.trace("TEMP completeRequest Calling CompleteRequest() Txid :" + transactionId);
 
   ts.completeRequest();
-   } catch(Exception e) {
-  LOG.error("Returning from HBaseTxClient:completeRequest, ts.completeRequest: txid: " + transactionId + ", EXCEPTION: " + e);
-   throw new Exception("Exception during completeRequest, unable to commit.  Exception: " + e);
-   }
+   } catch(InterruptedException ie) {
+  LOG.warn("Interrupting HBaseTxClient:completeRequest but retrying, ts.completeRequest: txid: " + transactionId + ", EXCEPTION: ", ie);
+  loopBack = true;
+   } while (loopBack);
--- End diff --

Same comment as above.




[jira] [Commented] (TRAFODION-1988) Better java exception handling in the java layer of Trafodion

2016-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351440#comment-15351440
 ] 

ASF GitHub Bot commented on TRAFODION-1988:
---

Github user DaveBirdsall commented on a diff in the pull request:

https://github.com/apache/incubator-trafodion/pull/559#discussion_r68619027
  
--- Diff: core/sqf/src/seatrans/tm/hbasetmlib2/src/main/java/org/trafodion/dtm/HBaseTxClient.java ---
@@ -452,7 +436,13 @@ public short abortTransaction(final long transactionID) throws Exception {
   Object commitDDLLock = new Object();
   synchronized(commitDDLLock)
   {
-commitDDLLock.wait();
+ boolean loopBack = false;
+ try {
+commitDDLLock.wait();
+ } catch(InterruptedException ie) {
+ LOG.warn("Interrupting commitDDLLock.wait,  but retrying ", ie);
+ loopBack = true;
+ } while (loopBack);
--- End diff --

This doesn't look right to me. Does "try" have a looping semantic when 
followed by "while"? Or did you intend to code a do { } around the try/catch 
block? (I'm thinking that the "while (loopBack);" does nothing if loopBack is 
false but loops infinitely if loopBack is true.)
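
For comparison, the do { } while form mentioned above could look roughly like the following. This is only a sketch of the general retry-on-interrupt pattern, not the code that was eventually committed; the Interruptible interface, the class name and the simulated interrupts are hypothetical and exist only to make the example self-contained.

```java
public class RetryWaitSketch {

    // Hypothetical callback standing in for a blocking call such as commitDDLLock.wait().
    interface Interruptible {
        void run() throws InterruptedException;
    }

    // Runs the action, retrying each time it is interrupted.
    // Returns the number of attempts that were made.
    static int retryUntilDone(Interruptible action) {
        int attempts = 0;
        boolean loopBack;
        do {
            loopBack = false;      // reset on every pass so the loop can end
            attempts++;
            try {
                action.run();
            } catch (InterruptedException ie) {
                loopBack = true;   // interrupted: go around the do/while again
            }
        } while (loopBack);
        return attempts;
    }

    public static void main(String[] args) {
        final int[] interruptsLeft = {2};
        int attempts = retryUntilDone(() -> {
            // Simulate two interrupts, then a normal return.
            if (interruptsLeft[0]-- > 0) {
                throw new InterruptedException("simulated");
            }
        });
        System.out.println(attempts); // prints 3
    }
}
```

Here the try/catch sits inside the loop body, so setting loopBack actually causes another attempt, and resetting it at the top of each pass lets the loop terminate once an attempt completes without interruption.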



[jira] [Commented] (TRAFODION-2069) 'sqcheck' script should not report on hosts in TRAF_EXCLUDE_LIST

2016-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15351391#comment-15351391
 ] 

ASF GitHub Bot commented on TRAFODION-2069:
---

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-trafodion/pull/556


> 'sqcheck' script should not report on hosts in TRAF_EXCLUDE_LIST
> 
>
> Key: TRAFODION-2069
> URL: https://issues.apache.org/jira/browse/TRAFODION-2069
> Project: Apache Trafodion
> Issue Type: Bug
> Components: foundation
> Affects Versions: 2.1-incubating
> Reporter: Gonzalo E Correa
> Assignee: Gonzalo E Correa
> Fix For: 2.1-incubating
>
>
> The 'sqcheck' script does not evaluate the contents of the TRAF_EXCLUDE_LIST 
> environment variable when the elasticity feature is used. Script logic 
> calculating configured process type counts needs to account for non-existent 
> nodes to calculate number of configured processes accurately.





[jira] [Commented] (TRAFODION-1609) Change swstatus to give a simple "up" or "down" answer, instead of the current, confusing display

2016-06-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350497#comment-15350497
 ] 

ASF GitHub Bot commented on TRAFODION-1609:
---

GitHub user mkby opened a pull request:

https://github.com/apache/incubator-trafodion/pull/562

[TRAFODION-1609]

Change swstatus to give a simple "up" or "down" answer, instead of the 
current, confusing display

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mkby/incubator-trafodion master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/562.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #562


commit 0e9f43aa18f1984fd1e6c07ddb0c95ecfe84211d
Author: Eason 
Date:   2016-06-27T06:40:07Z

[TRAFODION-1609]

Change swstatus to give a simple "up" or "down" answer, instead of the 
current, confusing display



