[jira] [Commented] (AIRAVATA-2737) Too many Zookeeper connections created
[ https://issues.apache.org/jira/browse/AIRAVATA-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424534#comment-16424534 ]

Dimuthu Upeksha commented on AIRAVATA-2737:
---

Fixed in https://github.com/apache/airavata/commit/8f7dc3dc8889bd21cb00911d323a66721a960c81

> Too many Zookeeper connections created
> --
>
> Key: AIRAVATA-2737
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2737
> Project: Airavata
> Issue Type: Bug
> Components: helix implementation
> Affects Versions: 0.18
> Reporter: Eroma
> Assignee: Dimuthu Upeksha
> Priority: Major
> Fix For: 0.18
>
>
> For each task in a workflow a ZooKeeper connection is opened. This creates
> too many ZooKeeper connections, and as a result some experiments are not
> moving past LAUNCHED.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (AIRAVATA-2734) Experiment status in LAUNCHED while job is in ACTIVE. Experiment status should be EXECUTING.
[ https://issues.apache.org/jira/browse/AIRAVATA-2734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424532#comment-16424532 ]

Dimuthu Upeksha commented on AIRAVATA-2734:
---

Fixed in https://github.com/apache/airavata/commit/bf3943a37fc182e7ad884c9683e8563f4bc29d5b

> Experiment status in LAUNCHED while job is in ACTIVE. Experiment status
> should be EXECUTING.
> --
>
> Key: AIRAVATA-2734
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2734
> Project: Airavata
> Issue Type: Bug
> Components: helix implementation
> Affects Versions: 0.18
> Reporter: Eroma
> Assignee: Dimuthu Upeksha
> Priority: Major
> Fix For: 0.18
>
>
> Experiment status should change to EXECUTING when it is picked up by Helix.
> Once the status changes to EXECUTING, the job status will change to
> SUBMITTED, QUEUED and ACTIVE.
> Once the job is COMPLETED, experiment status will change to COMPLETED after
> the output file transfers are completed.
>
> Currently the experiment status is LAUNCHED while the job is submitted and
> running.
[jira] [Commented] (AIRAVATA-2733) Improvements to Helix log messages
[ https://issues.apache.org/jira/browse/AIRAVATA-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424528#comment-16424528 ]

Dimuthu Upeksha commented on AIRAVATA-2733:
---

Fixed in
https://github.com/apache/airavata/commit/8f7dc3dc8889bd21cb00911d323a66721a960c81
https://github.com/apache/airavata/commit/55747caf5f11ebfb2507d96e83f61a9938ceb857

> Improvements to Helix log messages
> --
>
> Key: AIRAVATA-2733
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2733
> Project: Airavata
> Issue Type: Improvement
> Components: helix implementation
> Affects Versions: 0.18
> Reporter: Eroma
> Assignee: Dimuthu Upeksha
> Priority: Major
> Fix For: 0.18
>
>
> New additions to the current Helix log messages
> # Add the job submission command to the log. Currently it is not there and
> only the job status is there.
> # Print the complete job submission response from the cluster. This is
> useful for investigating when an experiment and/or job fails.
> # Print both the token and the description in the log for the credential
> store token in use.
[jira] [Created] (AIRAVATA-2737) Too many Zookeeper connections created
Eroma created AIRAVATA-2737:
---

Summary: Too many Zookeeper connections created
Key: AIRAVATA-2737
URL: https://issues.apache.org/jira/browse/AIRAVATA-2737
Project: Airavata
Issue Type: Bug
Components: helix implementation
Affects Versions: 0.18
Reporter: Eroma
Assignee: Dimuthu Upeksha
Fix For: 0.18

For each task in a workflow a ZooKeeper connection is opened. This creates too many ZooKeeper connections, and as a result some experiments are not moving past LAUNCHED.
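
One common way to avoid a connection per task is to have all tasks reuse a single lazily created client. The sketch below only illustrates that idea and is not the change made in the Airavata commit: `SharedClient`, `SharedClientDemo` and the string "connection" are hypothetical stand-ins, and a real implementation would put the actual ZooKeeper client creation inside the factory.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Holds one lazily created client that every task reuses. */
final class SharedClient<T> {
    private final Supplier<T> factory; // hypothetical: would open the real connection
    private T client;                  // created on first use, then reused

    SharedClient(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the single shared client, opening it only on the first call. */
    synchronized T get() {
        if (client == null) {
            client = factory.get();
        }
        return client;
    }
}

final class SharedClientDemo {
    /** Simulates the given number of task runs and returns how many connections were opened. */
    static int connectionsOpened(int tasks) {
        AtomicInteger opened = new AtomicInteger();
        SharedClient<String> shared =
                new SharedClient<>(() -> "connection-" + opened.incrementAndGet());
        for (int i = 0; i < tasks; i++) {
            shared.get(); // each task asks for the client instead of opening its own
        }
        return opened.get();
    }
}
```

With this arrangement the number of open connections stays constant no matter how many tasks a workflow contains; the trade-off is that the shared client has to be thread-safe and closed once, at shutdown, rather than per task.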
[jira] [Created] (AIRAVATA-2736) Job submitted and running in HPC while the experiment is tagged as FAILED
Eroma created AIRAVATA-2736:
---

Summary: Job submitted and running in HPC while the experiment is tagged as FAILED
Key: AIRAVATA-2736
URL: https://issues.apache.org/jira/browse/AIRAVATA-2736
Project: Airavata
Issue Type: Bug
Components: helix implementation
Affects Versions: 0.18
Environment: http://149.165.168.248:8008/ - Helix test env
Reporter: Eroma
Assignee: Dimuthu Upeksha
Fix For: 0.18

# Submitted an experiment, which then submitted the job.
# The job ID is returned and the job status is ACTIVE.
# Due to a ZooKeeper connection issue the experiment is marked FAILED.
# The job is still running in HPC.
# Airavata is not waiting for job monitoring, as the task status is not updated in ZooKeeper.
# Error in log [1]
# SLM001-AmberSander-BR2_5ed5a19f-ab44-4eba-afb7-1feafaf0bbdd - exp ID

[1]
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /monitoring/2159926/lock
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:778)
    at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:696)
    at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:679)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
    at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:676)
    at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:453)
    at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:443)
    at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44)
    at org.apache.airavata.helix.impl.task.submission.JobSubmissionTask.createMonitoringNode(JobSubmissionTask.java:83)
    at org.apache.airavata.helix.impl.task.submission.DefaultJobSubmissionTask.onRun(DefaultJobSubmissionTask.java:144)
    at org.apache.airavata.helix.impl.task.AiravataTask.onRun(AiravataTask.java:264)
    at org.apache.airavata.helix.core.AbstractTask.run(AbstractTask.java:74)
    at org.apache.helix.task.TaskRunner.run(TaskRunner.java:70)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
[jira] [Created] (AIRAVATA-2735) When transferring input files, check for the file size and 0 byte files transfers should be restricted
Eroma created AIRAVATA-2735:
---

Summary: When transferring input files, check for the file size and 0 byte files transfers should be restricted
Key: AIRAVATA-2735
URL: https://issues.apache.org/jira/browse/AIRAVATA-2735
Project: Airavata
Issue Type: Improvement
Components: helix implementation
Affects Versions: 0.18
Reporter: Eroma
Assignee: Dimuthu Upeksha
Fix For: 0.18

# When transferring input files, if a file is 0 bytes in size, the file transfer task should fail and the experiment should fail.
# The user should be notified that the file is empty.
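
The requested check could look roughly like the following. This is a minimal sketch against a local file using only the Java standard library; `InputFileCheck` and `requireNonEmpty` are hypothetical names, and how Airavata would obtain the size of a remote input file or notify the user is not covered here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class InputFileCheck {
    /** Returns the file size in bytes, or throws if the file is empty. */
    static long requireNonEmpty(Path input) throws IOException {
        long size = Files.size(input); // size in bytes
        if (size == 0) {
            // The message names the file so the user can be told which input is empty.
            throw new IOException("Input file is empty (0 bytes): " + input);
        }
        return size;
    }

    /** Demo: returns {emptyFileRejected, nonEmptyFileAccepted} for two temporary files. */
    static boolean[] demo() {
        try {
            Path empty = Files.createTempFile("input", ".dat");
            Path filled = Files.createTempFile("input", ".dat");
            Files.write(filled, new byte[] {1, 2, 3});

            boolean emptyRejected;
            try {
                requireNonEmpty(empty);
                emptyRejected = false;
            } catch (IOException e) {
                emptyRejected = true; // the transfer task would fail at this point
            }
            boolean filledAccepted = requireNonEmpty(filled) == 3;

            Files.delete(empty);
            Files.delete(filled);
            return new boolean[] {emptyRejected, filledAccepted};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Failing before the transfer starts keeps the error close to its cause, so the experiment can be failed with a message that points at the empty input rather than at a later job error.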
[jira] [Created] (AIRAVATA-2734) Experiment status in LAUNCHED while job is in ACTIVE. Experiment status should be EXECUTING.
Eroma created AIRAVATA-2734:
---

Summary: Experiment status in LAUNCHED while job is in ACTIVE. Experiment status should be EXECUTING.
Key: AIRAVATA-2734
URL: https://issues.apache.org/jira/browse/AIRAVATA-2734
Project: Airavata
Issue Type: Bug
Components: helix implementation
Affects Versions: 0.18
Reporter: Eroma
Assignee: Dimuthu Upeksha
Fix For: 0.18

Experiment status should change to EXECUTING when it is picked up by Helix. Once the status changes to EXECUTING, the job status will change to SUBMITTED, QUEUED and ACTIVE.

Once the job is COMPLETED, experiment status will change to COMPLETED after the output file transfers are completed.

Currently the experiment status is LAUNCHED while the job is submitted and running.
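
The expected ordering described above can be written down as a small state tracker. This is only an illustrative sketch of the described flow, not Airavata's status model: `ExperimentTracker`, its method names and the two enums are hypothetical, and only the states named in this issue are included.

```java
enum ExperimentState { LAUNCHED, EXECUTING, COMPLETED }

enum JobState { SUBMITTED, QUEUED, ACTIVE, COMPLETED }

/** Tracks one experiment and its job according to the ordering described in the issue. */
final class ExperimentTracker {
    private ExperimentState experiment = ExperimentState.LAUNCHED;
    private JobState job; // null until the job has been submitted

    /** Helix picks up the experiment: LAUNCHED becomes EXECUTING. */
    void pickedUpByHelix() {
        experiment = ExperimentState.EXECUTING;
    }

    /** Job status updates are only expected once the experiment is EXECUTING. */
    void jobStatus(JobState next) {
        if (experiment != ExperimentState.EXECUTING) {
            throw new IllegalStateException(
                    "Job is " + next + " but experiment is still " + experiment);
        }
        job = next;
    }

    /** The experiment completes after the job is COMPLETED and outputs are transferred. */
    void outputTransfersCompleted() {
        if (job != JobState.COMPLETED) {
            throw new IllegalStateException("Job has not completed: " + job);
        }
        experiment = ExperimentState.COMPLETED;
    }

    ExperimentState experiment() {
        return experiment;
    }

    JobState job() {
        return job;
    }
}
```

In this sketch the reported bug corresponds to `jobStatus(JobState.ACTIVE)` being called while the experiment is still LAUNCHED, which the tracker rejects with an exception.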
[jira] [Created] (AIRAVATA-2733) Improvements to Helix log messages
Eroma created AIRAVATA-2733:
---

Summary: Improvements to Helix log messages
Key: AIRAVATA-2733
URL: https://issues.apache.org/jira/browse/AIRAVATA-2733
Project: Airavata
Issue Type: Improvement
Components: helix implementation
Affects Versions: 0.18
Reporter: Eroma
Assignee: Dimuthu Upeksha
Fix For: 0.18

New additions to the current Helix log messages
# Add the job submission command to the log. Currently it is not there and only the job status is there.
# Print the complete job submission response from the cluster. This is useful for investigating when an experiment and/or job fails.
# Print both the token and the description in the log for the credential store token in use.
[jira] [Resolved] (AIRAVATA-2590) Update UGE_groovy.template to apply different parallel environment (-pe) values
[ https://issues.apache.org/jira/browse/AIRAVATA-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Christie resolved AIRAVATA-2590.
---
Resolution: Fixed

> Update UGE_groovy.template to apply different parallel environment (-pe)
> values
> --
>
> Key: AIRAVATA-2590
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2590
> Project: Airavata
> Issue Type: Bug
> Reporter: Marcus Christie
> Assignee: Marcus Christie
> Priority: Major
[jira] [Assigned] (AIRAVATA-2716) In airavata desktop client in CILogon search and 'remember this selection doesn't work'
[ https://issues.apache.org/jira/browse/AIRAVATA-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Christie reassigned AIRAVATA-2716:
---
Assignee: (was: Marcus Christie)

> In airavata desktop client in CILogon search and 'remember this selection
> doesn't work'
> --
>
> Key: AIRAVATA-2716
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2716
> Project: Airavata
> Issue Type: Bug
> Components: Airavata Desktop Client
> Affects Versions: 0.18
> Reporter: Eroma
> Priority: Major
> Fix For: 0.18
>
> Attachments: Screen Shot 2018-03-16 at 3.56.57 PM.png
>
>
> In desktop client
> # Click Login
> # Click button 'Sign in with CILogon'
> # Start typing the institute name in the 'Search' field
> # The institute list above vanishes.
> # Instead the user needs to navigate the list to find the institute.
> # Once the institute is found, click 'Remember this selection'
> # The next time CILogon is clicked, the previous selection is not
> remembered.
[jira] [Assigned] (AIRAVATA-2714) Exception thrown when user clones an old experiment where application is empty in experiment browse
[ https://issues.apache.org/jira/browse/AIRAVATA-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Christie reassigned AIRAVATA-2714:
---
Assignee: (was: Marcus Christie)

> Exception thrown when user clones an old experiment where application is
> empty in experiment browse
> --
>
> Key: AIRAVATA-2714
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2714
> Project: Airavata
> Issue Type: Bug
> Components: Airavata System, PGA PHP Web Gateway
> Affects Versions: 0.18
> Environment: https://dev.seagrid.org and https://seagrid.org
> Reporter: Eroma
> Priority: Major
> Fix For: 0.18
>
> Attachments: Screen Shot 2018-03-16 at 1.38.02 PM.png
>
>
> The scenario applies to HPCs which get decommissioned.
> # The application deployment is deleted as the machine is no longer
> available.
> # When the user clones the experiment, there is a proper error message
> saying that the deployment is no longer available.
> # This is acceptable, but a cloned experiment is still created.
>
> # When the application interface is deleted,
> # In experiment browse, experiment summary and in detailed experiment
> summary the application field is empty.
> # When the user clones it, an exception is thrown
> ## ErrorException (E_NOTICE)
> Undefined variable: appId
> # E.g.: gamess_stampede
> # The cloned experiment is actually created; the exception is thrown in the
> edit experiment window.