 [ 
https://issues.apache.org/jira/browse/OOZIE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Olson updated OOZIE-3723:

Description: 
We recently encountered two separate issues that both ultimately required an 
Oozie service restart to resolve after extensive troubleshooting. In both 
situations it was apparent that certain workflow-supplied configuration 
properties related to remote FileSystem connectivity, used to obtain HDFS 
credentials for remote clusters (via {{mapreduce.job.hdfs-servers}}), are 
being retained permanently within some kind of cache in the Oozie service or 
the underlying Hadoop code (cached global FileSystem instances, perhaps). The 
previously cached "poisoned" values supersede corrected values after the 
workflow configuration is fixed, giving us no known way to fix the problem 
without restarting the Oozie service. We confirmed that the {{hdfs-site.xml}} 
and {{oozie-site.xml}} files where Oozie is running had not been updated since 
the prior restart, so this is not a basic case of stale server-side 
configuration.
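
One mechanism that would be consistent with the "cached global FileSystem 
instances" guess above: Hadoop's {{FileSystem.get}} cache keys instances by URI 
scheme, authority and user, not by the contents of the Configuration, so an 
instance created with a bad workflow property keeps that property for later 
callers. Whether this is what actually happens on the Oozie code path has not 
been confirmed. The following is a minimal, self-contained simulation of that 
behavior; the class and method names are hypothetical and this is not Oozie or 
Hadoop code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation only: shows how a cache whose key ignores the
// configuration keeps serving the first ("poisoned") settings it saw.
final class ConfCacheSketch {

    static final String PATTERN_KEY = "dfs.namenode.kerberos.principal.pattern";

    /** Stand-in for a client that captures its configuration at creation time. */
    static final class Client {
        final String principalPattern;

        Client(Map<String, String> conf) {
            // "*" mirrors the system default mentioned in this report.
            this.principalPattern = conf.getOrDefault(PATTERN_KEY, "*");
        }
    }

    /** Cache keyed by scheme, authority and user; the configuration is NOT part of the key. */
    static final class ClientCache {
        private final Map<String, Client> cache = new HashMap<>();

        Client get(String scheme, String authority, String user, Map<String, String> conf) {
            String key = scheme + "://" + authority + "|" + user;
            // conf is only consulted when no instance exists yet for this key.
            return cache.computeIfAbsent(key, k -> new Client(conf));
        }
    }

    public static void main(String[] args) {
        ClientCache cache = new ClientCache();

        Map<String, String> badWorkflowConf = new HashMap<>();
        badWorkflowConf.put(PATTERN_KEY, "*/*@*");

        Map<String, String> fixedWorkflowConf = new HashMap<>();
        fixedWorkflowConf.put(PATTERN_KEY, "*");

        Client first = cache.get("hdfs", "remote-nn:8020", "oozie", badWorkflowConf);
        Client second = cache.get("hdfs", "remote-nn:8020", "oozie", fixedWorkflowConf);

        System.out.println("same instance: " + (first == second));
        System.out.println("pattern seen by fixed workflow: " + second.principalPattern);
    }
}
```

If this is the mechanism, Hadoop's {{fs.hdfs.impl.disable.cache}} setting and 
{{FileSystem.newInstance}} would be the places to look; neither has been tried 
in this environment.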

We are running Oozie version 5.2.0 in this environment.

Complete stack traces are provided below.

Issue 1:

A workflow incorrectly set {{dfs.namenode.kerberos.principal.pattern}} to 
{{*/*@*}}, but our system default is {{*}}.
{noformat}
org.apache.oozie.action.ActionExecutorException: JA009: Couldn't set up IO 
streams: java.lang.IllegalArgumentException: Server has invalid Kerberos 
principal: nn/hostname.some.domain@kerberos.realm.com, doesn't match the 
pattern: '*/*@*'
at 
org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
at 
org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1134)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1644)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
at org.apache.oozie.command.XCommand.call(XCommand.java:290)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:363)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Couldn't set up IO streams: 
java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
nn/hostname.some.domain@kerberos.realm.com, doesn't match the pattern: 
'*/*@*'
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:894)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
at org.apache.hadoop.ipc.Client.call(Client.java:1502)
at org.apache.hadoop.ipc.Client.call(Client.java:1455)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
at com.sun.proxy.$Proxy35.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1134)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy36.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:734)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getDelegatio

[jira] [Updated] (OOZIE-3723) Oozie service permanently caches workflow-supplied FileSystem connectivity configuration properties for obtaining HDFS Credentials until restarted



 [ 
https://issues.apache.org/jira/browse/OOZIE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Olson updated OOZIE-3723:

Description: 
We recently encountered two separate issues that both required an Oozie 
service restart to resolve. In both situations it was apparent that incorrect 
workflow-supplied configuration properties related to remote FileSystem 
connectivity, used to obtain HDFS credentials for remote clusters (via 
{{mapreduce.job.hdfs-servers}}), are being retained permanently within some 
kind of cache in the Oozie service or the underlying Hadoop code. These cached 
values supersede corrected values after the workflow configuration is 
fixed, giving us no known way to fix the problem without restarting the Oozie 
service. We confirmed that the {{hdfs-site.xml}} and {{oozie-site.xml}} files 
where Oozie is running had not been updated since the prior restart, so this 
is not a basic case of stale configuration.

We are running Oozie version 5.2.0 in this environment.

Complete stack traces are provided below.

Issue 1:

A workflow incorrectly set {{dfs.namenode.kerberos.principal.pattern}} to 
{{*/*@*}}, but our system default is {{*}}.
{noformat}
org.apache.oozie.action.ActionExecutorException: JA009: Couldn't set up IO 
streams: java.lang.IllegalArgumentException: Server has invalid Kerberos 
principal: nn/hostname.some.domain@kerberos.realm.com, doesn't match the 
pattern: '*/*@*'
at 
org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
at 
org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1134)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1644)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
at org.apache.oozie.command.XCommand.call(XCommand.java:290)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:363)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Couldn't set up IO streams: 
java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
nn/hostname.some.domain@kerberos.realm.com, doesn't match the pattern: 
'*/*@*'
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:894)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
at org.apache.hadoop.ipc.Client.call(Client.java:1502)
at org.apache.hadoop.ipc.Client.call(Client.java:1455)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
at com.sun.proxy.$Proxy35.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1134)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy36.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:734)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1996)
at 
org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(D

[jira] [Updated] (OOZIE-3723) Oozie service permanently caches workflow-supplied FileSystem connectivity configuration properties for obtaining HDFS Credentials until restarted



 [ 
https://issues.apache.org/jira/browse/OOZIE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Olson updated OOZIE-3723:

Description: 
We recently encountered two separate issues that both required an Oozie 
service restart to resolve. In both situations it was apparent that incorrect 
workflow-supplied configuration properties related to remote FileSystem 
connectivity, used to obtain HDFS credentials for remote clusters (via 
{{mapreduce.job.hdfs-servers}}), are being retained permanently within some 
kind of cache in the Oozie service or the underlying Hadoop code. These cached 
values supersede corrected values after the workflow configuration is 
fixed, giving us no known way to fix the problem without restarting the Oozie 
service. We confirmed that the {{hdfs-site.xml}} and {{oozie-site.xml}} files 
where Oozie is running had not been updated since the prior restart, so this 
is not a basic case of stale configuration.

We are running Oozie version 5.2.0 in this environment.

Complete stack traces are provided below.

Issue 1:

A workflow incorrectly set {{dfs.namenode.kerberos.principal.pattern}} to 
{{*/*@*}}, but our system default is {{*}}.
{noformat}
org.apache.oozie.action.ActionExecutorException: JA009: Couldn't set up IO 
streams: java.lang.IllegalArgumentException: Server has invalid Kerberos 
principal: nn/hostname.some.domain@kerberos.realm.com, doesn't match the 
pattern: '*/*@*'
at 
org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
at 
org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1134)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1644)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
at org.apache.oozie.command.XCommand.call(XCommand.java:290)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:363)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Couldn't set up IO streams: 
java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
nn/hostname.some.domain@kerberos.realm.com, doesn't match the pattern: 
'*/*@*'
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:894)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
at org.apache.hadoop.ipc.Client.call(Client.java:1502)
at org.apache.hadoop.ipc.Client.call(Client.java:1455)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
at com.sun.proxy.$Proxy35.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1134)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy36.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:734)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1996)
at 
org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(D

[jira] [Updated] (OOZIE-3723) Oozie service permanently caches workflow-supplied FileSystem connectivity configuration properties for obtaining HDFS Credentials until restarted



 [ 
https://issues.apache.org/jira/browse/OOZIE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Olson updated OOZIE-3723:

Description: 
We recently encountered two separate issues that both required an Oozie 
service restart to resolve. In both situations it was apparent that incorrect 
workflow-supplied configuration properties related to remote FileSystem 
connectivity, used to obtain HDFS credentials for remote clusters (via 
{{mapreduce.job.hdfs-servers}}), are being retained permanently within some 
kind of cache in the Oozie service or the underlying Hadoop code. These cached 
values supersede corrected values after the workflow configuration is 
fixed, giving us no known way to fix the problem without restarting the Oozie 
service. We confirmed that the {{hdfs-site.xml}} and {{oozie-site.xml}} files 
where Oozie is running had not been updated since the prior restart, so this 
is not a basic case of stale configuration.

We are running Oozie version 5.2.0 in this environment.

Complete stack traces are provided below.

Issue 1:

A workflow incorrectly set {{dfs.namenode.kerberos.principal.pattern}} to 
{{*/*@*}}, but our system default is {{*}}.
{noformat}
org.apache.oozie.action.ActionExecutorException: JA009: Couldn't set up IO 
streams: java.lang.IllegalArgumentException: Server has invalid Kerberos 
principal: nn/hostname.some.domain@kerberos.realm.com, doesn't match the 
pattern: '*/*@*'
at 
org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
at 
org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1134)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1644)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
at org.apache.oozie.command.XCommand.call(XCommand.java:290)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:363)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Couldn't set up IO streams: 
java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
nn/hostname.some.domain@kerberos.realm.com, doesn't match the pattern: 
'*/*@*'
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:894)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
at org.apache.hadoop.ipc.Client.call(Client.java:1502)
at org.apache.hadoop.ipc.Client.call(Client.java:1455)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
at com.sun.proxy.$Proxy35.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1134)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy36.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:734)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1996)
at 
org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(D

[jira] [Updated] (OOZIE-3723) Oozie service permanently caches workflow-supplied FileSystem connectivity configuration properties for obtaining HDFS Credentials until restarted



 [ 
https://issues.apache.org/jira/browse/OOZIE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Olson updated OOZIE-3723:

Description: 
We recently encountered two separate issues that both required an Oozie 
service restart to resolve. In both situations it was apparent that incorrect 
workflow-supplied configuration properties related to remote FileSystem 
connectivity, used to obtain HDFS credentials for remote clusters (via 
{{mapreduce.job.hdfs-servers}}), are being retained permanently within some 
kind of cache in the Oozie service or the underlying Hadoop code. These cached 
values supersede corrected values after the workflow configuration is 
fixed, giving us no known way to fix the problem without restarting the Oozie 
service. We confirmed that the {{hdfs-site.xml}} and {{oozie-site.xml}} files 
where Oozie is running had not been updated since the prior restart, so this 
is not a basic case of stale configuration.

We are running Oozie version 5.2.0 in this environment.

Complete stack traces are provided below.

Issue 1:

A workflow incorrectly set {{dfs.namenode.kerberos.principal.pattern}} to 
{{*/*@*}}, but our system default is {{*}}.
{noformat}
org.apache.oozie.action.ActionExecutorException: JA009: Couldn't set up IO 
streams: java.lang.IllegalArgumentException: Server has invalid Kerberos 
principal: nn/hostname.some.domain@kerberos.realm.com, doesn't match the 
pattern: '*/*@*'
at 
org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
at 
org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1134)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1644)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
at org.apache.oozie.command.XCommand.call(XCommand.java:290)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:363)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Couldn't set up IO streams: 
java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
nn/hostname.some.domain@kerberos.realm.com, doesn't match the pattern: 
'*/*@*'
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:894)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
at org.apache.hadoop.ipc.Client.call(Client.java:1502)
at org.apache.hadoop.ipc.Client.call(Client.java:1455)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
at com.sun.proxy.$Proxy35.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1134)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy36.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:734)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1996)
at 
org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(D

[jira] [Created] (OOZIE-3723) Oozie service permanently caches workflow-supplied FileSystem connectivity configuration properties for obtaining HDFS Credentials until restarted

Andrew Olson created OOZIE-3723:
---

             Summary: Oozie service permanently caches workflow-supplied 
FileSystem connectivity configuration properties for obtaining HDFS Credentials 
until restarted
                 Key: OOZIE-3723
                 URL: https://issues.apache.org/jira/browse/OOZIE-3723
             Project: Oozie
          Issue Type: Bug
          Components: workflow
            Reporter: Andrew Olson


We recently encountered two separate issues that both required an Oozie 
service restart to resolve. In both situations it was apparent that incorrect 
workflow-supplied configuration properties related to remote FileSystem 
connectivity, used to obtain HDFS credentials for remote clusters (via 
{{mapreduce.job.hdfs-servers}}), are being retained permanently within some 
kind of cache in the Oozie service or the underlying Hadoop code. These cached 
values supersede corrected values after the workflow configuration is 
fixed, giving us no known way to fix the problem without restarting the Oozie 
service. We confirmed that the {{hdfs-site.xml}} and {{oozie-site.xml}} files 
where Oozie is running had not been updated since the prior restart, so this 
is not a basic case of stale configuration.

We are running Oozie version 5.2.0 in this environment.

Complete stack traces are provided below.

Issue 1:

A workflow incorrectly set {{dfs.namenode.kerberos.principal.pattern}} to 
{{*/*@*}}, but our system default is {{*}}.
{noformat}
org.apache.oozie.action.ActionExecutorException: JA009: Couldn't set up IO 
streams: java.lang.IllegalArgumentException: Server has invalid Kerberos 
principal: nn/hostname.some.domain@kerberos.realm.com, doesn't match the 
pattern: '*/*@*'
at 
org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:463)
at 
org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:443)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1134)
at 
org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1644)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:243)
at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:68)
at org.apache.oozie.command.XCommand.call(XCommand.java:290)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:363)
at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:210)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: Couldn't set up IO streams: 
java.lang.IllegalArgumentException: Server has invalid Kerberos principal: 
nn/hostname.some.domain@kerberos.realm.com, doesn't match the pattern: 
'*/*@*'
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:894)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:414)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1677)
at org.apache.hadoop.ipc.Client.call(Client.java:1502)
at org.apache.hadoop.ipc.Client.call(Client.java:1455)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
at com.sun.proxy.$Proxy35.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:1134)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy36.getDelegationToken(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSClient.getDelega

[jira] [Commented] (OOZIE-3719) Improve coordinator scope range checking

2024-08-03 Thread Jira



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870681#comment-17870681
 ] 

Dénes Bodó commented on OOZIE-3719:
---

I was not able to fix the build issues in Jenkins, nor will I have the 
bandwidth to work on them in the near future. I asked for help on 
[priv...@oozie.apache.org|mailto:private@oozie.apache]; no response so far. 
On a best-effort basis, I will try to find someone who can do the proper 
testing or fix the build issues.

> Improve coordinator scope range checking
> 
>
>                 Key: OOZIE-3719
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3719
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5.2.1
>            Reporter: Sanjay Kumar Sahu
>            Assignee: Sanjay Kumar Sahu
>            Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, OOZIE-3719-005.patch, OOZIE-3719-006.patch, 
> OOZIE-3719-007.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, focusing on the action and type query strings,
> we can see that the filter variable gets its value from the 
> requestsParameters.
> Once the filter parameter is populated, an if statement checks whether 
> scope and type are not null, and next
> the code checks whether logRetrievalType is equal to JOB_LOG_ACTION (which is 
> the action query string).
>  
> Next, the value of logRetrievalScope is split by "," and the code enters the 
> if block.
> In the block where ranges of actions are processed ( if (s.contains("-")) { 
> ... } ), an attacker could potentially
> send a specially crafted request with a massive range, such as "1-100". 
> This would create a for loop
> iterating and adding that many actions to the actionSet, consuming CPU and 
> memory resources.
> Though there is a subsequent check against maxNumActionsForLog, this check 
> only happens after all the iterations,
> allowing an attacker to consume resources before this check is made:
>  
> !image-2023-09-15-02-52-09-320.png!
>  
>  
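
To make the quoted description concrete, here is a minimal sketch of checking 
the size of a range before iterating over it, so that an oversized scope is 
rejected without any per-action work. It is not the attached patch and not 
Oozie's actual code: the class name ScopeRangeSketch and the method parseScope 
are hypothetical, and only the names actionSet and maxNumActionsForLog are 
taken from the description.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: validate a range's size before expanding it.
final class ScopeRangeSketch {

    /** Parses a scope such as "1-3,5" into action numbers, rejecting oversized input early. */
    static Set<Integer> parseScope(String scope, int maxNumActionsForLog) {
        Set<Integer> actionSet = new LinkedHashSet<>();
        for (String part : scope.split(",")) {
            String s = part.trim();
            if (s.contains("-")) {
                String[] bounds = s.split("-");
                if (bounds.length != 2) {
                    throw new IllegalArgumentException("Malformed range: " + s);
                }
                int start = Integer.parseInt(bounds[0].trim());
                int end = Integer.parseInt(bounds[1].trim());
                if (start > end) {
                    throw new IllegalArgumentException("Range start exceeds end: " + s);
                }
                // Size is computed in long arithmetic so it cannot overflow,
                // and the range is rejected BEFORE the loop runs.
                long size = (long) end - start + 1;
                if (size > maxNumActionsForLog) {
                    throw new IllegalArgumentException("Range too large: " + s);
                }
                // long loop variable avoids wrap-around when end is Integer.MAX_VALUE.
                for (long i = start; i <= end; i++) {
                    actionSet.add((int) i);
                }
            } else {
                actionSet.add(Integer.parseInt(s));
            }
            // Total across all parts is still bounded after each part is added.
            if (actionSet.size() > maxNumActionsForLog) {
                throw new IllegalArgumentException("Too many actions requested");
            }
        }
        return actionSet;
    }

    public static void main(String[] args) {
        System.out.println(parseScope("1-3,5", 10));
    }
}
```

With this ordering, the work done per range is bounded by maxNumActionsForLog 
regardless of the numbers in the request.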



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (OOZIE-3719) Improve coordinator scope range checking

2024-07-17 Thread Jira



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866680#comment-17866680
 ] 

János Makai commented on OOZIE-3719:


The latest patch looks good to me, thanks [~dionusos]!

> Improve coordinator scope range checking
> 
>
>                 Key: OOZIE-3719
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3719
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 5.2.1
>            Reporter: Sanjay Kumar Sahu
>            Assignee: Sanjay Kumar Sahu
>            Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, OOZIE-3719-005.patch, OOZIE-3719-006.patch, 
> OOZIE-3719-007.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, focusing on the action and type query strings,
> we can see that the filter variable gets its value from the 
> requestsParameters.
> Once the filter parameter is populated, an if statement checks whether 
> scope and type are not null, and next
> the code checks whether logRetrievalType is equal to JOB_LOG_ACTION (which is 
> the action query string).
>  
> Next, the value of logRetrievalScope is split by "," and the code enters the 
> if block.
> In the block where ranges of actions are processed ( if (s.contains("-")) { 
> ... } ), an attacker could potentially
> send a specially crafted request with a massive range, such as "1-100". 
> This would create a for loop
> iterating and adding that many actions to the actionSet, consuming CPU and 
> memory resources.
> Though there is a subsequent check against maxNumActionsForLog, this check 
> only happens after all the iterations,
> allowing an attacker to consume resources before this check is made:
>  
> !image-2023-09-15-02-52-09-320.png!
>  
>  
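
For illustration only, here is a minimal sketch of the kind of range check the description argues for: the size of each range is computed and compared against the limit before any iteration happens. This is not the attached patch and not Oozie's actual code. The class name ScopeRangeSketch, the method parseScope, and the maxActions parameter (standing in for maxNumActionsForLog) are hypothetical names chosen for this example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; names and structure are assumptions, not Oozie code.
final class ScopeRangeSketch {

    /**
     * Parses a scope string such as "1-3,5" into a set of action numbers.
     * maxActions plays the role of maxNumActionsForLog from the description.
     */
    static Set<Integer> parseScope(String scope, int maxActions) {
        Set<Integer> actions = new LinkedHashSet<>();
        for (String part : scope.split(",")) {
            String s = part.trim();
            if (s.contains("-")) {
                String[] bounds = s.split("-");
                if (bounds.length != 2) {
                    throw new IllegalArgumentException("Malformed range: " + s);
                }
                int start = Integer.parseInt(bounds[0].trim());
                int end = Integer.parseInt(bounds[1].trim());
                if (start > end) {
                    throw new IllegalArgumentException("Reversed range: " + s);
                }
                // Compute the range size in long arithmetic BEFORE looping,
                // so an oversized range is rejected without iterating.
                long size = (long) end - start + 1;
                if (size > maxActions) {
                    throw new IllegalArgumentException("Range too large: " + s);
                }
                // A long counter avoids overflow when end is Integer.MAX_VALUE.
                for (long i = start; i <= end; i++) {
                    actions.add((int) i);
                }
            } else {
                actions.add(Integer.parseInt(s));
            }
            // Re-check the running total after every element, not only at the end.
            if (actions.size() > maxActions) {
                throw new IllegalArgumentException("Too many actions requested");
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        System.out.println(parseScope("1-3,5", 10));
    }
}
```

With this ordering, a single range never costs more than maxActions loop iterations, and the running total is bounded after each comma-separated element instead of only after the whole scope string has been expanded.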




[jira] [Commented] (OOZIE-3719) Improve coordinator scope range checking

2024-07-16 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866382#comment-17866382
 ] 

Hadoop QA commented on OOZIE-3719:
--

PreCommit-OOZIE-Build started


> Improve coordinator scope range checking
> 
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, OOZIE-3719-005.patch, OOZIE-3719-006.patch, 
> OOZIE-3719-007.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>

[jira] [Updated] (OOZIE-3719) Improve coordinator scope range checking

2024-07-16 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó updated OOZIE-3719:
--
Attachment: OOZIE-3719-007.patch

> Improve coordinator scope range checking
> 
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, OOZIE-3719-005.patch, OOZIE-3719-006.patch, 
> OOZIE-3719-007.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Improve coordinator scope range checking

2024-07-15 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866243#comment-17866243
 ] 

Hadoop QA commented on OOZIE-3719:
--

PreCommit-OOZIE-Build started


> Improve coordinator scope range checking
> 
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, OOZIE-3719-005.patch, OOZIE-3719-006.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>

[jira] [Updated] (OOZIE-3719) Improve coordinator scope range checking

2024-07-15 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó updated OOZIE-3719:
--
Attachment: OOZIE-3719-006.patch

> Improve coordinator scope range checking
> 
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, OOZIE-3719-005.patch, OOZIE-3719-006.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Improve coordinator scope range checking

2024-07-15 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866174#comment-17866174
 ] 

Hadoop QA commented on OOZIE-3719:
--


Testing JIRA OOZIE-3719

Cleaning local git workspace



{color:red}-1{color} Patch failed to apply to head of branch




> Improve coordinator scope range checking
> 
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, OOZIE-3719-005.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Improve coordinator scope range checking

2024-07-15 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866172#comment-17866172
 ] 

Hadoop QA commented on OOZIE-3719:
--

PreCommit-OOZIE-Build started


> Improve coordinator scope range checking
> 
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, OOZIE-3719-005.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>

[jira] [Updated] (OOZIE-3719) Improve coordinator scope range checking

2024-07-15 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó updated OOZIE-3719:
--
Attachment: OOZIE-3719-005.patch

> Improve coordinator scope range checking
> 
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, OOZIE-3719-005.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>

[jira] [Updated] (OOZIE-3719) Improve coordinator scope range checking

2024-07-12 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó updated OOZIE-3719:
--
Summary: Improve coordinator scope range checking  (was: Apache Oozie Regex 
Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting 
Access for Intended Users)

> Improve coordinator scope range checking
> 
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-07-09 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864260#comment-17864260
 ] 

Hadoop QA commented on OOZIE-3719:
--

PreCommit-OOZIE-Build started


> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>

[jira] [Updated] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-07-09 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó updated OOZIE-3719:
--
Attachment: OOZIE-3719-003.patch

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> OOZIE-3719-003.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>

[jira] [Comment Edited] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-07-09 Thread Jira



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864062#comment-17864062
 ] 

János Makai edited comment on OOZIE-3719 at 7/9/24 8:30 AM:


Looks like the *PreCommit-OOZIE-Build* is failing for the recent patch(es), but 
this seems like an +unrelated issue+ to the change:
[https://ci-hadoop.apache.org/job/PreCommit-OOZIE-Build/219/consoleFull]

Other than this the patch looks good so far, I'm waiting for the corresponding 
unit tests to be created.

Thanks [~dionusos] 


was (Author: jmakai):
Looks like the *PreCommit-OOZIE-Build* is failing for the recent patch(es) but 
this seems like an +unrelated issue+ to the change{+}
[https://ci-hadoop.apache.org/job/PreCommit-OOZIE-Build/219/consoleFull]

{+}Other than this the patch looks good so far, I'm waiting for the 
corresponding unit tests to be created.

Thanks [~dionusos] 

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-07-09 Thread Jira



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864062#comment-17864062
 ] 

János Makai commented on OOZIE-3719:


Looks like the *PreCommit-OOZIE-Build* is failing for the recent patch(es) but 
this seems like an +unrelated issue+ to the change:
[https://ci-hadoop.apache.org/job/PreCommit-OOZIE-Build/219/consoleFull]

Other than this the patch looks good so far, I'm waiting for the 
corresponding unit tests to be created.

Thanks [~dionusos] 

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-07-09 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864035#comment-17864035
 ] 

Hadoop QA commented on OOZIE-3719:
--

PreCommit-OOZIE-Build started


> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-07-08 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863825#comment-17863825
 ] 

Hadoop QA commented on OOZIE-3719:
--

PreCommit-OOZIE-Build started


> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-07-08 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17863824#comment-17863824
 ] 

Hadoop QA commented on OOZIE-3719:
--

PreCommit-OOZIE-Build started


> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
> Issue Type: Bug
> Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, we focus on the action and type query strings.
> The filter variable gets its value from the request parameters 
> (requestsParameters).
> Once the filter parameter is populated, an if statement checks that scope 
> and type are not null, and then the code checks whether logRetrievalType 
> equals JOB_LOG_ACTION (which is the action query string).
>  
> Next, the value of logRetrievalScope is split on "," and each element enters 
> the if block.
> In the block where ranges of actions are processed ( if (s.contains("-")) \{ 
> ... } ), an attacker could potentially
> send a specially crafted request with a massive range, such as "1-100". 
> This would run a for loop that iterates over the range and adds that many 
> actions to actionSet, consuming CPU and memory resources.
> Although there is a subsequent check against maxNumActionsForLog, it happens 
> only after all the iterations,
> allowing an attacker to consume resources before the check is made:
>  
> !image-2023-09-15-02-52-09-320.png!
>  
>  
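
The ordering problem described in the quoted report (expand the range first, check the limit afterwards) can be illustrated with a minimal sketch. This is not Oozie's actual code and not the attached patch: the class name ActionScopeSketch and the method parseScope are hypothetical, and only the names actionSet and maxNumActionsForLog are taken from the report. The sketch assumes the limit can be enforced from the range bounds alone, before any iteration happens.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class ActionScopeSketch {

    // Hypothetical helper (not Oozie's implementation): expands a scope such as
    // "1-3,7" into action numbers and rejects oversized input before looping.
    static Set<Integer> parseScope(String scope, int maxNumActionsForLog) {
        Set<Integer> actionSet = new LinkedHashSet<>();
        for (String part : scope.split(",")) {
            String s = part.trim();
            if (s.contains("-")) {
                String[] bounds = s.split("-");
                if (bounds.length != 2) {
                    throw new IllegalArgumentException("Malformed range: " + s);
                }
                int start = Integer.parseInt(bounds[0].trim());
                int end = Integer.parseInt(bounds[1].trim());
                if (start > end) {
                    throw new IllegalArgumentException("Reversed range: " + s);
                }
                // The size is computed arithmetically, so the cost of this
                // check does not grow with the size of the requested range.
                long rangeSize = (long) end - start + 1;
                long remaining = (long) maxNumActionsForLog - actionSet.size();
                if (rangeSize > remaining) {
                    throw new IllegalArgumentException("Too many actions requested: " + s);
                }
                // A long counter avoids int overflow when end is Integer.MAX_VALUE.
                for (long i = start; i <= end; i++) {
                    actionSet.add((int) i);
                }
            } else {
                if (actionSet.size() >= maxNumActionsForLog) {
                    throw new IllegalArgumentException("Too many actions requested: " + s);
                }
                actionSet.add(Integer.parseInt(s));
            }
        }
        return actionSet;
    }

    public static void main(String[] args) {
        System.out.println(parseScope("1-3,7", 10));
        try {
            parseScope("1-100", 50);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check is deliberately conservative: overlapping ranges are counted twice, so a request may be rejected slightly earlier than a check on the final set size would reject it.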




[jira] [Updated] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-07-08 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó updated OOZIE-3719:
--
Attachment: OOZIE-3719-002.patch

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 5.2.1
>Reporter: Sanjay Kumar Sahu
>Assignee: Sanjay Kumar Sahu
>Priority: Major
> Attachments: OOZIE-3719-001.patch, OOZIE-3719-002.patch, 
> image-2023-09-15-02-47-52-819.png, image-2023-09-15-02-49-14-531.png, 
> image-2023-09-15-02-52-09-320.png, oozie3719.patch
>
>

[jira] [Updated] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-02-14 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó updated OOZIE-3719:
--
Fix Version/s: (was: 5.3.0)

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 5.2.1
>Reporter: Sanjay Kumar Sahu
>Assignee: Sanjay Kumar Sahu
>Priority: Major
> Attachments: OOZIE-3719-001.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-02-14 Thread Jira



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817263#comment-17817263
 ] 

Dénes Bodó commented on OOZIE-3719:
---

[~SanjayKumarSahu] 

The [Jenkins job|https://ci-hadoop.apache.org/job/PreCommit-OOZIE-Build/216/] 
cannot apply the uploaded patch with this command:
{code:bash}
git apply --check -v -p0 < OOZIE-3719-001.patch
{code}
 

Could you please format your patch as described in the contribution guide?

[https://cwiki.apache.org/confluence/display/OOZIE/How+To+Contribute]
{code:bash}
git diff --no-prefix
{code}
Thank you

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 5.2.1
>Reporter: Sanjay Kumar Sahu
>Assignee: Sanjay Kumar Sahu
>Priority: Major
> Fix For: 5.3.0
>
> Attachments: OOZIE-3719-001.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-02-14 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817246#comment-17817246
 ] 

Hadoop QA commented on OOZIE-3719:
--


Testing JIRA OOZIE-3719

Cleaning local git workspace



{color:red}-1{color} Patch failed to apply to head of branch




> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 5.2.1
>Reporter: Sanjay Kumar Sahu
>Assignee: Sanjay Kumar Sahu
>Priority: Major
> Fix For: 5.3.0
>
> Attachments: OOZIE-3719-001.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2024-02-14 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817240#comment-17817240
 ] 

Hadoop QA commented on OOZIE-3719:
--

PreCommit-OOZIE-Build started


> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 5.2.1
>Reporter: Sanjay Kumar Sahu
>Assignee: Sanjay Kumar Sahu
>Priority: Major
> Fix For: 5.3.0
>
> Attachments: OOZIE-3719-001.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>

[jira] [Commented] (OOZIE-3722) Workflow actions can stuck in RUNNING state when DB connections are killed on the DB side

2024-02-14 Thread Jira



[ 
https://issues.apache.org/jira/browse/OOZIE-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817237#comment-17817237
 ] 

Dénes Bodó commented on OOZIE-3722:
---

If anybody who finds this ticket has a suggestion, solution or question, 
please do not hesitate to post it here or on any of the Oozie mailing lists.

> Workflow actions can stuck in RUNNING state when DB connections are killed on 
> the DB side
> -
>
> Key: OOZIE-3722
> URL: https://issues.apache.org/jira/browse/OOZIE-3722
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 5.2.1
>Reporter: Dénes Bodó
>Assignee: Dénes Bodó
>Priority: Critical
>

[jira] [Created] (OOZIE-3722) Workflow actions can stuck in RUNNING state when DB connections are killed on the DB side

2024-02-13 Thread Jira

Dénes Bodó created OOZIE-3722:
-

Summary: Workflow actions can stuck in RUNNING state when DB
connections are killed on the DB side
Key: OOZIE-3722
URL: https://issues.apache.org/jira/browse/OOZIE-3722
Project: Oozie
Issue Type: Bug
Components: core
Affects Versions: 5.2.1
Reporter: Dénes Bodó
Assignee: Dénes Bodó

Apache Oozie 5.2.1 uses OpenJPA 2.4.2, commons-dbcp 1.4 and commons-pool
1.5.4. These are ancient versions, I know.
h1. Description

When network issues or "maintenance work" on the DB side (especially with
PostgreSQL) cause DB connections to be closed, the connection pool on the
client side becomes exhausted. Many threads are then waiting at this point:
{noformat}
"pool-2-thread-4" #20 prio=5 os_prio=31 tid=0x7faf7903b800 nid=0x8603 waiting on condition [0x00030f3e7000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for <0x00066aca8e70> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
	at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:1324)
{noformat}
According to my observation, this is because neither the JDBC driver's
connection nor the wrapping DBCP connection
(_org.apache.commons.dbcp2.PoolableConnection_) gets closed on the client side.

This issue can cause workflow actions to get stuck in the RUNNING state: the
thread that would update the DB after XActionExecutor.check() never gets a
connection, so it waits indefinitely.
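
The waiting pattern in the thread dump above can be sketched in isolation. The example below is only an illustration under stated assumptions: TinyPool is a hypothetical stand-in for a connection pool, it uses java.util.concurrent.LinkedBlockingDeque rather than the commons-pool2 class of the same name, and plain strings stand in for connections. It shows that once every pooled object has been handed out and none is returned, a borrower with an unbounded wait would park forever, whereas a bounded wait at least lets the caller notice the exhaustion. Whether a borrow timeout would resolve the behaviour reported here is not established by this report.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

class PoolExhaustionSketch {

    // Hypothetical stand-in for a connection pool; each String plays the role
    // of an idle connection.
    static final class TinyPool {
        private final LinkedBlockingDeque<String> idle = new LinkedBlockingDeque<>();

        TinyPool(int size) {
            for (int i = 0; i < size; i++) {
                idle.add("conn-" + i);
            }
        }

        // Bounded wait: returns null on timeout instead of parking forever,
        // which is what an unbounded takeFirst() would do on an empty deque.
        String borrow(long timeoutMillis) {
            try {
                return idle.pollFirst(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }

        void giveBack(String conn) {
            idle.addFirst(conn);
        }
    }

    // Simulates the leak: both connections are borrowed and never returned,
    // so a third borrower can only time out.
    static boolean borrowTimesOutWhenLeaked() {
        TinyPool pool = new TinyPool(2);
        pool.borrow(10);
        pool.borrow(10);
        return pool.borrow(50) == null;
    }

    public static void main(String[] args) {
        System.out.println("third borrow timed out: " + borrowTimesOutWhenLeaked());
    }
}
```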

h1. Workaround

Restart Oozie and/or fix the DB/network issue.
h1. Repro

(Un)Fortunately I can reproduce the issue using the latest and greatest
commons-dbcp 2.11.0 and commons-pool 2.12.0 along with OpenJPA 3.2.2.

I've just created a Java application to reproduce the issue:
[https://github.com/dionusos/pool_exhausted_repro] . See README.md for detailed
repro steps.

DBCP-595 was created to ask the DBCP/Pool teams for help. I am working on the
case to provide them with the necessary information.


[jira] [Updated] (OOZIE-3721) Subsidiaries freeze in the status of "RUNNING" during a high load on the cluster

2024-01-29 Thread Cecily Myles (Jira)



 [ 
https://issues.apache.org/jira/browse/OOZIE-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cecily Myles updated OOZIE-3721:

Description: 
When my cluster is under load, subworkflows hang in the "RUNNING" status. I 
see this problem when working with Hive tables, but I also managed to 
reproduce it by launching the usual pi calculation in many subworkflows to 
imitate the load.

I launch an Oozie workflow with the following structure:
{code:java}
-- Oozie workflow
--> subworkflow_1
-- fork_1
-- fork_2
-- ...
-- fork_n
--> subworkflow_2
-- fork_1
-- fork_2
-- ...
-- fork_n {code}
One of the forks has the status "RUNNING", but if you open this fork, it shows 
the status "SUCCESS".

Parent workflow:
{code:java}
Job ID : 0061971-240125161152217-oozie-oozi-W

Workflow Name : test-subworkflow
App Path      : hdfs://mycluster:8020/user/cecyl/subwf/job
Status        : RUNNING
Run           : 0
User          : cecyl
Group         : -
Created       : 2024-01-25 15:55 GMT
Started       : 2024-01-25 15:55 GMT
Last Modified : 2024-01-30 06:24 GMT
Ended         : -
CoordAction ID: -

Actions
ID                                            Status   Ext ID                                Ext Status Err Code
0061971-240125161152217-oozie-oozi-W@:start:  OK       -                                     OK         -
0061971-240125161152217-oozie-oozi-W@fork     OK       -                                     OK         -
0061971-240125161152217-oozie-oozi-W@fork7    OK       0067643-240125161152217-oozie-oozi-W  SUCCEEDED  -
0061971-240125161152217-oozie-oozi-W@fork9    OK       0067640-240125161152217-oozie-oozi-W  SUCCEEDED  -
0061971-240125161152217-oozie-oozi-W@fork10   RUNNING  0067641-240125161152217-oozie-oozi-W  RUNNING    -
0061971-240125161152217-oozie-oozi-W@fork5    OK       0067645-240125161152217-oozie-oozi-W  SUCCEEDED  -
 {code}
Running subworkflow:
{code:java}
Job ID : 0067641-240125161152217-oozie-oozi-W

Workflow Name : test-subworkflow
App Path      : hdfs://mycluster:8020/user/cecyl/subwf
Status        : RUNNING
Run           : 0
User          : cecyl
Group         : -
Created       : 2024-01-26 04:20 GMT
Started       : 2024-01-26 04:20 GMT
Last Modified : 2024-01-26 08:23 GMT
Ended         : -
CoordAction ID: 0061971-240125161152217-oozie-oozi-W

Actions
ID                                            Status   Ext ID                                Ext Status Err Code
0067641-240125161152217-oozie-oozi-W@:start:  OK       -                                     OK         -
0067641-240125161152217-oozie-oozi-W@fork     OK       -                                     OK         -
0067641-240125161152217-oozie-oozi-W@fork21   RUNNING  application_1706187939089_147514      RUNNING    -
0067641-240125161152217-oozie-oozi-W@fork22   RUNNING  app

[jira] [Updated] (OOZIE-3721) Subsidiaries freeze in the status of "RUNNING" during a high load on the cluster

2024-01-29 Thread Cecily Myles (Jira)



 [ 
https://issues.apache.org/jira/browse/OOZIE-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cecily Myles updated OOZIE-3721:

Description: 
When my cluster is under load, subworkflows hang in the "RUNNING" status. I 
see this problem when working with Hive tables, but I also managed to 
reproduce it by launching the usual pi calculation in many subworkflows to 
imitate the load.

I launch an Oozie workflow with the following structure:
{code:java}
-- Oozie workflow
--> subworkflow_1
-- fork_1
-- fork_2
-- ...
-- fork_n
--> subworkflow_2
-- fork_1
-- fork_2
-- ...
-- fork_n {code}
One of the forks has the status "RUNNING", but if you open this fork, it shows 
the status "SUCCESS".

Parent workflow:
{code:java}
Job ID : 0061971-240125161152217-oozie-oozi-W

Workflow Name : test-subworkflow
App Path      : hdfs://mycluster:8020/user/cecyl/subwf/job
Status        : RUNNING
Run           : 0
User          : cecyl
Group         : -
Created       : 2024-01-25 15:55 GMT
Started       : 2024-01-25 15:55 GMT
Last Modified : 2024-01-30 06:24 GMT
Ended         : -
CoordAction ID: -

Actions
ID                                            Status   Ext ID                                Ext Status Err Code
0061971-240125161152217-oozie-oozi-W@:start:  OK       -                                     OK         -
0061971-240125161152217-oozie-oozi-W@fork     OK       -                                     OK         -
0061971-240125161152217-oozie-oozi-W@fork7    OK       0067643-240125161152217-oozie-oozi-W  SUCCEEDED  -
0061971-240125161152217-oozie-oozi-W@fork9    OK       0067640-240125161152217-oozie-oozi-W  SUCCEEDED  -
0061971-240125161152217-oozie-oozi-W@fork10   RUNNING  0067641-240125161152217-oozie-oozi-W  RUNNING    -
0061971-240125161152217-oozie-oozi-W@fork5    OK       0067645-240125161152217-oozie-oozi-W  SUCCEEDED  -

 {code}
Running subworkflow:
{code:java}
Job ID : 0067641-240125161152217-oozie-oozi-W

Workflow Name : test-subworkflow
App Path      : hdfs://mycluster:8020/user/cecyl/subwf
Status        : RUNNING
Run           : 0
User          : cecyl
Group         : -
Created       : 2024-01-26 04:20 GMT
Started       : 2024-01-26 04:20 GMT
Last Modified : 2024-01-26 08:23 GMT
Ended         : -
CoordAction ID: 0061971-240125161152217-oozie-oozi-W

Actions
ID                                            Status   Ext ID                                Ext Status Err Code
0067641-240125161152217-oozie-oozi-W@:start:  OK       -                                     OK         -
0067641-240125161152217-oozie-oozi-W@fork     OK       -                                     OK         -
0067641-240125161152217-oozie-oozi-W@fork21   RUNNING  application_1706187939089_147514      RUNNING    -

[jira] [Created] (OOZIE-3721) Subsidiaries freeze in the status of "RUNNING" during a high load on the cluster

2024-01-29 Thread Cecily Myles (Jira)

Cecily Myles created OOZIE-3721:
---

 Summary: Subsidiaries freeze in the status of "RUNNING" during a 
high load on the cluster
 Key: OOZIE-3721
 URL: https://issues.apache.org/jira/browse/OOZIE-3721
 Project: Oozie
  Issue Type: Bug
  Components: core
Affects Versions: 5.2.0
Reporter: Cecily Myles


When my cluster is under load, subworkflows hang in the "RUNNING" status. I 
see this problem when working with Hive tables, but I also managed to 
reproduce it by launching the usual pi calculation in many subworkflows to 
imitate the load.

I launch an Oozie workflow with the following structure:
{code:java}
-- Oozie workflow
--> subworkflow_1
-- fork_1
-- fork_2
-- ...
-- fork_n
--> subworkflow_2
-- fork_1
-- fork_2
-- ...
-- fork_n {code}
One of the forks has the status "RUNNING", but if you open this fork, it shows 
the status "SUCCESS".

Parent workflow:
{code:java}
Job ID : 0061971-240125161152217-oozie-oozi-W

Workflow Name : test-subworkflow
App Path      : hdfs://mycluster:8020/user/cecyl/subwf/job
Status        : RUNNING
Run           : 0
User          : cecyl
Group         : -
Created       : 2024-01-25 15:55 GMT
Started       : 2024-01-25 15:55 GMT
Last Modified : 2024-01-30 06:24 GMT
Ended         : -
CoordAction ID: -

Actions
ID                                            Status   Ext ID                                Ext Status Err Code
0061971-240125161152217-oozie-oozi-W@:start:  OK       -                                     OK         -
0061971-240125161152217-oozie-oozi-W@fork     OK       -                                     OK         -
0061971-240125161152217-oozie-oozi-W@fork7    OK       0067643-240125161152217-oozie-oozi-W  SUCCEEDED  -
0061971-240125161152217-oozie-oozi-W@fork9    OK       0067640-240125161152217-oozie-oozi-W  SUCCEEDED  -
0061971-240125161152217-oozie-oozi-W@fork10   RUNNING  0067641-240125161152217-oozie-oozi-W  RUNNING    -
0061971-240125161152217-oozie-oozi-W@fork5    OK       0067645-240125161152217-oozie-oozi-W  SUCCEEDED  -

 {code}
Running subworkflow:
{code:java}
Job ID : 0067641-240125161152217-oozie-oozi-W

Workflow Name : test-subworkflow
App Path      : hdfs://mycluster:8020/user/cecyl/subwf
Status        : RUNNING
Run           : 0
User          : cecyl
Group         : -
Created       : 2024-01-26 04:20 GMT
Started       : 2024-01-26 04:20 GMT
Last Modified : 2024-01-26 08:23 GMT
Ended         : -
CoordAction ID: 0061971-240125161152217-oozie-oozi-W

Actions
ID                                            Status   Ext ID                                Ext Status Err Code
0067641-240125161152217-oozie-oozi-W@:start:  OK       -                                     OK         -
0067641-240125161152217-oozie-oozi-W@fork

[jira] [Commented] (OOZIE-3720) Upgrade jetty to 9.4.53 due to CVE-2023-44487

2024-01-29 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811849#comment-17811849
 ] 

Hadoop QA commented on OOZIE-3720:
--


Testing JIRA OOZIE-3720

Cleaning local git workspace



{color:red}-1{color} Patch failed to apply to head of branch




> Upgrade jetty to 9.4.53 due to CVE-2023-44487
> -
>
> Key: OOZIE-3720
> URL: https://issues.apache.org/jira/browse/OOZIE-3720
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Anmol Sundaram
>Priority: Major
> Attachments: OOZIE-3720-001.patch
>
>
> The latest version of Jetty 9.x is 9.4.53.v20231009, which also fixes 
> CVE-2023-44487. As such, we should see if we can upgrade Jetty to 
> 9.4.53.v20231009 in Oozie.
>  
> PR - https://github.com/apache/oozie/pull/93/files




[jira] [Commented] (OOZIE-3720) Upgrade jetty to 9.4.53 due to CVE-2023-44487

2024-01-29 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811844#comment-17811844
 ] 

Hadoop QA commented on OOZIE-3720:
--

PreCommit-OOZIE-Build started


> Upgrade jetty to 9.4.53 due to CVE-2023-44487
> -
>
> Key: OOZIE-3720
> URL: https://issues.apache.org/jira/browse/OOZIE-3720
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Anmol Sundaram
>Priority: Major
> Attachments: OOZIE-3720-001.patch
>
>

[jira] [Updated] (OOZIE-3720) Upgrade jetty to 9.4.53 due to CVE-2023-44487



 [ 
https://issues.apache.org/jira/browse/OOZIE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anmol Sundaram updated OOZIE-3720:
--
Attachment: (was: OOZIE-3720-001.patch)

> Upgrade jetty to 9.4.53 due to CVE-2023-44487
> -
>
> Key: OOZIE-3720
> URL: https://issues.apache.org/jira/browse/OOZIE-3720
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Anmol Sundaram
>Priority: Major
> Attachments: OOZIE-3720-001.patch
>
>

[jira] [Updated] (OOZIE-3720) Upgrade jetty to 9.4.53 due to CVE-2023-44487



 [ 
https://issues.apache.org/jira/browse/OOZIE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anmol Sundaram updated OOZIE-3720:
--
Attachment: (was: OOZIE-3720-001-1.patch)

> Upgrade jetty to 9.4.53 due to CVE-2023-44487
> -
>
> Key: OOZIE-3720
> URL: https://issues.apache.org/jira/browse/OOZIE-3720
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Anmol Sundaram
>Priority: Major
> Attachments: OOZIE-3720-001.patch
>
>

[jira] [Updated] (OOZIE-3720) Upgrade jetty to 9.4.53 due to CVE-2023-44487



 [ 
https://issues.apache.org/jira/browse/OOZIE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anmol Sundaram updated OOZIE-3720:
--
Attachment: OOZIE-3720-001.patch

> Upgrade jetty to 9.4.53 due to CVE-2023-44487
> -
>
> Key: OOZIE-3720
> URL: https://issues.apache.org/jira/browse/OOZIE-3720
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Anmol Sundaram
>Priority: Major
> Attachments: OOZIE-3720-001.patch
>
>

[jira] [Updated] (OOZIE-3720) Upgrade jetty to 9.4.53 due to CVE-2023-44487



 [ 
https://issues.apache.org/jira/browse/OOZIE-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anmol Sundaram updated OOZIE-3720:
--
Description: 
The latest version of Jetty 9.x is 9.4.53.v20231009, which also fixes 
CVE-2023-44487. As such, we should see if we can upgrade Jetty to 
9.4.53.v20231009 in Oozie.

 

PR - https://github.com/apache/oozie/pull/93/files

  was:The latest version for Jetty 9.x is 9.4.53.v20231009. This also fixes the 
CVE-2023-44487. As such, we should see if we can upgrade jetty to 
9.4.53.v20231009 in Oozie.


> Upgrade jetty to 9.4.53 due to CVE-2023-44487
> -
>
> Key: OOZIE-3720
> URL: https://issues.apache.org/jira/browse/OOZIE-3720
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Anmol Sundaram
>Priority: Major
>
> The latest Jetty 9.x release is 9.4.53.v20231009, which also fixes 
> CVE-2023-44487. We should therefore check whether Jetty can be upgraded to 
> 9.4.53.v20231009 in Oozie.
>  
> PR - https://github.com/apache/oozie/pull/93/files




[jira] [Created] (OOZIE-3720) Upgrade jetty to 9.4.53 due to CVE-2023-44487

Anmol Sundaram created OOZIE-3720:
-

 Summary: Upgrade jetty to 9.4.53 due to CVE-2023-44487
 Key: OOZIE-3720
 URL: https://issues.apache.org/jira/browse/OOZIE-3720
 Project: Oozie
  Issue Type: Improvement
Reporter: Anmol Sundaram


The latest Jetty 9.x release is 9.4.53.v20231009, which also fixes 
CVE-2023-44487. We should therefore check whether Jetty can be upgraded to 
9.4.53.v20231009 in Oozie.




[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2023-12-08 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17794659#comment-17794659
 ] 

Hadoop QA commented on OOZIE-3719:
--


Testing JIRA OOZIE-3719

Cleaning local git workspace



{color:red}-1{color} Patch failed to apply to head of branch




> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Fix For: 5.3.0
>
> Attachments: OOZIE-3719-001.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, focusing on the action and type query strings, 
> we can see that the filter variable gets its value from requestsParameters. 
> Once the filter parameter is populated, an if statement checks that scope and 
> type are not null, and the code then checks whether logRetrievalType equals 
> JOB_LOG_ACTION (the action query string).
>  
> Next, the value of logRetrievalScope is split on "," and each element is 
> examined in an if statement. In the block where ranges of actions are 
> processed ( if (s.contains("-")) { ... } ), an attacker could send a 
> specially crafted request with a massive range, such as "1-100". This 
> creates a for loop that iterates over the whole range and adds that many 
> actions to the actionSet, consuming CPU and memory. Although there is a 
> subsequent check against maxNumActionsForLog, it happens only after all the 
> iterations, so an attacker can consume resources before the check is made -
>  
> !image-2023-09-15-02-52-09-320.png!
>  
>  




[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2023-12-08 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17794655#comment-17794655
 ] 

Hadoop QA commented on OOZIE-3719:
--

PreCommit-OOZIE-Build started


> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Fix For: 5.3.0
>
> Attachments: OOZIE-3719-001.patch, image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, focusing on the action and type query strings, 
> we can see that the filter variable gets its value from requestsParameters. 
> Once the filter parameter is populated, an if statement checks that scope and 
> type are not null, and the code then checks whether logRetrievalType equals 
> JOB_LOG_ACTION (the action query string).
>  
> Next, the value of logRetrievalScope is split on "," and each element is 
> examined in an if statement. In the block where ranges of actions are 
> processed ( if (s.contains("-")) { ... } ), an attacker could send a 
> specially crafted request with a massive range, such as "1-100". This 
> creates a for loop that iterates over the whole range and adds that many 
> actions to the actionSet, consuming CPU and memory. Although there is a 
> subsequent check against maxNumActionsForLog, it happens only after all the 
> iterations, so an attacker can consume resources before the check is made -
>  
> !image-2023-09-15-02-52-09-320.png!
>  
>  




[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2023-12-08 Thread Jira



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17794647#comment-17794647
 ] 

Dénes Bodó commented on OOZIE-3719:
---

[~SanjayKumarSahu] Please upload your patch with the following name 
"OOZIE-3719-001.patch" and then push the "Submit Patch" button to start the 
automated build and tests.

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
>     URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, focusing on the action and type query strings, 
> we can see that the filter variable gets its value from requestsParameters. 
> Once the filter parameter is populated, an if statement checks that scope and 
> type are not null, and the code then checks whether logRetrievalType equals 
> JOB_LOG_ACTION (the action query string).
>  
> Next, the value of logRetrievalScope is split on "," and each element is 
> examined in an if statement. In the block where ranges of actions are 
> processed ( if (s.contains("-")) { ... } ), an attacker could send a 
> specially crafted request with a massive range, such as "1-100". This 
> creates a for loop that iterates over the whole range and adds that many 
> actions to the actionSet, consuming CPU and memory. Although there is a 
> subsequent check against maxNumActionsForLog, it happens only after all the 
> iterations, so an attacker can consume resources before the check is made -
>  
> !image-2023-09-15-02-52-09-320.png!
>  
>  




[jira] [Updated] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2023-12-06 Thread Sanjay Kumar Sahu (Jira)



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Kumar Sahu updated OOZIE-3719:
-
Attachment: oozie3719.patch

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png, 
> oozie3719.patch
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, focusing on the action and type query strings, 
> we can see that the filter variable gets its value from requestsParameters. 
> Once the filter parameter is populated, an if statement checks that scope and 
> type are not null, and the code then checks whether logRetrievalType equals 
> JOB_LOG_ACTION (the action query string).
>  
> Next, the value of logRetrievalScope is split on "," and each element is 
> examined in an if statement. In the block where ranges of actions are 
> processed ( if (s.contains("-")) { ... } ), an attacker could send a 
> specially crafted request with a massive range, such as "1-100". This 
> creates a for loop that iterates over the whole range and adds that many 
> actions to the actionSet, consuming CPU and memory. Although there is a 
> subsequent check against maxNumActionsForLog, it happens only after all the 
> iterations, so an attacker can consume resources before the check is made -
>  
> !image-2023-09-15-02-52-09-320.png!
>  
>  




[jira] [Commented] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2023-12-06 Thread Sanjay Kumar Sahu (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17793748#comment-17793748
 ] 

Sanjay Kumar Sahu commented on OOZIE-3719:
--

PR link : https://github.com/apache/oozie/pull/92

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, focusing on the action and type query strings, 
> we can see that the filter variable gets its value from requestsParameters. 
> Once the filter parameter is populated, an if statement checks that scope and 
> type are not null, and the code then checks whether logRetrievalType equals 
> JOB_LOG_ACTION (the action query string).
>  
> Next, the value of logRetrievalScope is split on "," and each element is 
> examined in an if statement. In the block where ranges of actions are 
> processed ( if (s.contains("-")) { ... } ), an attacker could send a 
> specially crafted request with a massive range, such as "1-100". This 
> creates a for loop that iterates over the whole range and adds that many 
> actions to the actionSet, consuming CPU and memory. Although there is a 
> subsequent check against maxNumActionsForLog, it happens only after all the 
> iterations, so an attacker can consume resources before the check is made -
>  
> !image-2023-09-15-02-52-09-320.png!
>  
>  




[jira] [Assigned] (OOZIE-3718) Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability

2023-10-25 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó reassigned OOZIE-3718:
-

Assignee: Dénes Bodó

> Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability
> 
>
> Key: OOZIE-3718
> URL: https://issues.apache.org/jira/browse/OOZIE-3718
> Project: Oozie
>  Issue Type: Bug
> Reporter: Nikhil Daf
> Assignee: Dénes Bodó
> Priority: Major
>
> [CVE-2023-36877 - Security Update Guide - Microsoft - Azure Apache Oozie 
> Spoofing 
> Vulnerability|https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36877]




[jira] [Assigned] (OOZIE-3718) Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability

2023-10-25 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó reassigned OOZIE-3718:
-

Assignee: (was: Dénes Bodó)

> Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability
> 
>
> Key: OOZIE-3718
> URL: https://issues.apache.org/jira/browse/OOZIE-3718
> Project: Oozie
>  Issue Type: Bug
> Reporter: Nikhil Daf
> Priority: Major
>
> [CVE-2023-36877 - Security Update Guide - Microsoft - Azure Apache Oozie 
> Spoofing 
> Vulnerability|https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36877]




[jira] [Resolved] (OOZIE-3718) Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability

2023-10-25 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó resolved OOZIE-3718.
---
Fix Version/s: 5.3.0
   Resolution: Fixed

Thanks [~NikhilDaf] for the fix. Your change is committed to master.

> Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability
> 
>
> Key: OOZIE-3718
> URL: https://issues.apache.org/jira/browse/OOZIE-3718
> Project: Oozie
>  Issue Type: Bug
> Reporter: Nikhil Daf
> Priority: Major
> Fix For: 5.3.0
>
>
> [CVE-2023-36877 - Security Update Guide - Microsoft - Azure Apache Oozie 
> Spoofing 
> Vulnerability|https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36877]




[jira] [Commented] (OOZIE-3718) Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability

2023-10-25 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779374#comment-17779374
 ] 

ASF subversion and git services commented on OOZIE-3718:


Commit 318fac5391eb1b7e9b868ee6fb64f4e9c49850cb in oozie's branch 
refs/heads/master from Denes Bodo
[ https://gitbox.apache.org/repos/asf?p=oozie.git;h=318fac539 ]

OOZIE-3718 Improve Oozie Web UI filtering (NikhilDaf via dionusos)


> Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability
> 
>
> Key: OOZIE-3718
> URL: https://issues.apache.org/jira/browse/OOZIE-3718
> Project: Oozie
>  Issue Type: Bug
> Reporter: Nikhil Daf
> Priority: Major
>
> [CVE-2023-36877 - Security Update Guide - Microsoft - Azure Apache Oozie 
> Spoofing 
> Vulnerability|https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36877]




[jira] [Commented] (OOZIE-3718) Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability

2023-10-23 Thread Nikhil Daf (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17778581#comment-17778581
 ] 

Nikhil Daf commented on OOZIE-3718:
---

[~dionusos] As this is a security issue, I have been asked not to add the 
details here. I have sent the repro steps and the patch to the Apache security 
team. They will release the fix soon.

> Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability
> 
>
> Key: OOZIE-3718
> URL: https://issues.apache.org/jira/browse/OOZIE-3718
> Project: Oozie
>  Issue Type: Bug
> Reporter: Nikhil Daf
> Priority: Major
>
> [CVE-2023-36877 - Security Update Guide - Microsoft - Azure Apache Oozie 
> Spoofing 
> Vulnerability|https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36877]




[jira] [Commented] (OOZIE-3718) Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability

2023-10-17 Thread Jira



[ 
https://issues.apache.org/jira/browse/OOZIE-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17776227#comment-17776227
 ] 

Dénes Bodó commented on OOZIE-3718:
---

[~NikhilDaf] could you please attach your patch to the Jira?

Thank you.

> Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability
> 
>
> Key: OOZIE-3718
> URL: https://issues.apache.org/jira/browse/OOZIE-3718
> Project: Oozie
>  Issue Type: Bug
> Reporter: Nikhil Daf
> Priority: Major
>
> [CVE-2023-36877 - Security Update Guide - Microsoft - Azure Apache Oozie 
> Spoofing 
> Vulnerability|https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36877]




[jira] [Assigned] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2023-09-18 Thread Kinga Marton (Jira)



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kinga Marton reassigned OOZIE-3719:
---

Assignee: Sanjay Kumar Sahu

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Assignee: Sanjay Kumar Sahu
> Priority: Major
> Attachments: image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, focusing on the action and type query strings, 
> we can see that the filter variable gets its value from requestsParameters. 
> Once the filter parameter is populated, an if statement checks that scope and 
> type are not null, and the code then checks whether logRetrievalType equals 
> JOB_LOG_ACTION (the action query string).
>  
> Next, the value of logRetrievalScope is split on "," and each element is 
> examined in an if statement. In the block where ranges of actions are 
> processed ( if (s.contains("-")) { ... } ), an attacker could send a 
> specially crafted request with a massive range, such as "1-100". This 
> creates a for loop that iterates over the whole range and adds that many 
> actions to the actionSet, consuming CPU and memory. Although there is a 
> subsequent check against maxNumActionsForLog, it happens only after all the 
> iterations, so an attacker can consume resources before the check is made -
>  
> !image-2023-09-15-02-52-09-320.png!
>  
>  




[jira] [Updated] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2023-09-14 Thread Sanjay Kumar Sahu (Jira)



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Kumar Sahu updated OOZIE-3719:
-
Description: 
!image-2023-09-15-02-47-52-819.png!

 

Looking further into the code, focusing on the action and type query strings, 
we can see that the filter variable gets its value from requestsParameters. 
Once the filter parameter is populated, an if statement checks that scope and 
type are not null, and the code then checks whether logRetrievalType equals 
JOB_LOG_ACTION (the action query string).

 

Next, the value of logRetrievalScope is split on "," and each element is 
examined in an if statement. In the block where ranges of actions are 
processed ( if (s.contains("-")) { ... } ), an attacker could send a 
specially crafted request with a massive range, such as "1-100". This 
creates a for loop that iterates over the whole range and adds that many 
actions to the actionSet, consuming CPU and memory. Although there is a 
subsequent check against maxNumActionsForLog, it happens only after all the 
iterations, so an attacker can consume resources before the check is made -

 

!image-2023-09-15-02-52-09-320.png!

 

 

  was:
!image-2023-09-15-02-47-52-819.png!

 

Looking further into the code, focusing on the action and type query strings, 
we can see that the filter variable gets its value from requestsParameters. 
Once the filter parameter is populated, an if statement checks that scope and 
type are not null, and the code then checks whether logRetrievalType equals 
JOB_LOG_ACTION (the action query string).

 

Next, the value of logRetrievalScope is split on "," and each element is 
examined in an if statement. In the block where ranges of actions are 
processed ( if (s.contains("-")) { ... } ), an attacker could send a 
specially crafted request with a massive range, such as "1-100". This 
creates a for loop that iterates over the whole range and adds that many 
actions to the actionSet, consuming CPU and memory. Although there is a 
subsequent check against maxNumActionsForLog, it happens only after all the 
iterations, so an attacker can consume resources before the check is made -

 

!image-2023-09-15-02-50-26-331.png!

 

 


> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Priority: Major
> Attachments: image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, focusing on the action and type query strings, 
> we can see that the filter variable gets its value from requestsParameters. 
> Once the filter parameter is populated, an if statement checks that scope and 
> type are not null, and the code then checks whether logRetrievalType equals 
> JOB_LOG_ACTION (the action query string).
>  
> Next, the value of logRetrievalScope is split on "," and each element is 
> examined in an if statement. In the block where ranges of actions are 
> processed ( if (s.contains("-")) { ... } ), an attacker could send a 
> specially crafted request with a massive range, such as "1-100". This 
> creates a for loop that iterates over the whole range and adds that many 
> actions to the actionSet, consuming CPU and memory. Although there is a 
> subsequent check against maxNumActionsForLog, it happens only after all the 
> iterations, so an attacker can consume resources before the check is made -
>  
> !image-2023-09-15-02-52-09-320.png!
>  
>  




[jira] [Updated] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2023-09-14 Thread Sanjay Kumar Sahu (Jira)



 [ 
https://issues.apache.org/jira/browse/OOZIE-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Kumar Sahu updated OOZIE-3719:
-
Attachment: image-2023-09-15-02-52-09-320.png

> Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege 
> Users Disrupting Access for Intended Users
> --
>
> Key: OOZIE-3719
> URL: https://issues.apache.org/jira/browse/OOZIE-3719
> Project: Oozie
>  Issue Type: Bug
>  Components: core
> Affects Versions: 5.2.1
> Reporter: Sanjay Kumar Sahu
> Priority: Major
> Attachments: image-2023-09-15-02-47-52-819.png, 
> image-2023-09-15-02-49-14-531.png, image-2023-09-15-02-52-09-320.png
>
>
> !image-2023-09-15-02-47-52-819.png!
>  
> Looking further into the code, focusing on the action and type query strings, 
> we can see that the filter variable gets its value from requestsParameters. 
> Once the filter parameter is populated, an if statement checks that scope and 
> type are not null, and the code then checks whether logRetrievalType equals 
> JOB_LOG_ACTION (the action query string).
>  
> Next, the value of logRetrievalScope is split on "," and each element is 
> examined in an if statement. In the block where ranges of actions are 
> processed ( if (s.contains("-")) { ... } ), an attacker could send a 
> specially crafted request with a massive range, such as "1-100". This 
> creates a for loop that iterates over the whole range and adds that many 
> actions to the actionSet, consuming CPU and memory. Although there is a 
> subsequent check against maxNumActionsForLog, it happens only after all the 
> iterations, so an attacker can consume resources before the check is made -
>  
> !image-2023-09-15-02-50-26-331.png!
>  
>  




[jira] [Created] (OOZIE-3719) Apache Oozie Regex Denial of Service (ReDoS) Vulnerability by Low Privilege Users Disrupting Access for Intended Users

2023-09-14 Thread Sanjay Kumar Sahu (Jira)

Sanjay Kumar Sahu created OOZIE-3719:


 Summary: Apache Oozie Regex Denial of Service (ReDoS) 
Vulnerability by Low Privilege Users Disrupting Access for Intended Users
 Key: OOZIE-3719
 URL: https://issues.apache.org/jira/browse/OOZIE-3719
 Project: Oozie
  Issue Type: Bug
  Components: core
Affects Versions: 5.2.1
Reporter: Sanjay Kumar Sahu
 Attachments: image-2023-09-15-02-47-52-819.png, 
image-2023-09-15-02-49-14-531.png

!image-2023-09-15-02-47-52-819.png!

 

Looking further into the code, focusing on the action and type query strings, 
we can see that the filter variable gets its value from requestsParameters. 
Once the filter parameter is populated, an if statement checks that scope and 
type are not null, and the code then checks whether logRetrievalType equals 
JOB_LOG_ACTION (the action query string).

 

Next, the value of logRetrievalScope is split on "," and each element is 
examined in an if statement. In the block where ranges of actions are 
processed ( if (s.contains("-")) { ... } ), an attacker could send a 
specially crafted request with a massive range, such as "1-100". This 
creates a for loop that iterates over the whole range and adds that many 
actions to the actionSet, consuming CPU and memory. Although there is a 
subsequent check against maxNumActionsForLog, it happens only after all the 
iterations, so an attacker can consume resources before the check is made -

 

!image-2023-09-15-02-50-26-331.png!
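
The ordering problem described above (the limit is checked only after the range has been expanded) can be illustrated with a small sketch. This is not Oozie's code and not the attached patch: the class ActionRangeParser, the method parse and the parameter maxActions are hypothetical names chosen for illustration. Only the split on "," and the s.contains("-") test are taken from the description. The sketch validates the size of each range before looping over it, so a huge range is rejected without any iteration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Oozie: expands a scope string such as
// "1-3,7" into action numbers while enforcing a limit up front.
final class ActionRangeParser {

    static Set<Integer> parse(String scope, int maxActions) {
        Set<Integer> actions = new LinkedHashSet<>();
        for (String s : scope.split(",")) {
            s = s.trim();
            if (s.contains("-")) {
                String[] bounds = s.split("-");
                if (bounds.length != 2) {
                    throw new IllegalArgumentException("Malformed range: " + s);
                }
                int start = Integer.parseInt(bounds[0].trim());
                int end = Integer.parseInt(bounds[1].trim());
                if (start > end) {
                    throw new IllegalArgumentException("Malformed range: " + s);
                }
                // Check the size of the range BEFORE iterating over it.
                // long arithmetic avoids overflow for extreme bounds.
                long size = (long) end - start + 1;
                if (size > maxActions) {
                    throw new IllegalArgumentException(
                            "Range " + s + " exceeds the limit of " + maxActions);
                }
                for (int i = start; i <= end; i++) {
                    actions.add(i);
                }
            } else {
                actions.add(Integer.parseInt(s));
            }
            // Re-check the running total after every element, so the work
            // done before rejection stays proportional to maxActions.
            if (actions.size() > maxActions) {
                throw new IllegalArgumentException(
                        "More than " + maxActions + " actions requested");
            }
        }
        return actions;
    }
}
```

With this ordering, a call such as ActionRangeParser.parse("1-2000000000", 50) fails immediately instead of looping two billion times, while a small request such as "1-3,7" is expanded as before.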

 

 




[jira] [Created] (OOZIE-3718) Fix CVE-2023-36877 Azure Apache Oozie Spoofing Vulnerability

2023-09-12 Thread Nikhil Daf (Jira)

Nikhil Daf created OOZIE-3718:
-

 Summary: Fix CVE-2023-36877 Azure Apache Oozie Spoofing 
Vulnerability
 Key: OOZIE-3718
 URL: https://issues.apache.org/jira/browse/OOZIE-3718
 Project: Oozie
  Issue Type: Bug
Reporter: Nikhil Daf


[CVE-2023-36877 - Security Update Guide - Microsoft - Azure Apache Oozie 
Spoofing 
Vulnerability|https://msrc.microsoft.com/update-guide/vulnerability/CVE-2023-36877]




[jira] [Commented] (OOZIE-3717) When fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause de

2023-08-07 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751691#comment-17751691
 ] 

ASF subversion and git services commented on OOZIE-3717:


Commit 3c614c74cb5cb8897a2f95334b5e467227edf740 in oozie's branch 
refs/heads/master from Denes Bodo
[ https://gitbox.apache.org/repos/asf?p=oozie.git;h=3c614c74c ]

OOZIE-3717 When fork actions parallel submit, becasue ForkedActionStartXCommand 
and ActionStartXCommand has the same name, so ForkedActionStartXCommand would 
be lost, and cause deadlock (chenhd via dionusos)


> When fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
> Affects Versions: 5.2.1
> Reporter: chenhaodan
> Assignee: chenhaodan
> Priority: Major
> Attachments: OOZIE-3717-001.patch, OOZIE-3717-002.patch, 
> OOZIE-3717-003.patch
>
>
> When fork actions are submitted in parallel, a ForkedActionStartXCommand is 
> added for each of them, while RecoveryService, which checks pending actions, 
> may add an ActionStartXCommand for the same action. If the 
> ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the 
> same action is already in the queue, it is lost. The thread that submits the 
> actions in parallel blocks at CallableQueueService.blockingWait(), waiting 
> for the ForkedActionStartXCommand to finish, but that command has been lost, 
> which causes a deadlock.
> {code:java}
>          Thread 1                             Thread 2
>   (ForkedActionStartXCommand)           (ActionStartXCommand)
> +----------------------------+          +---------------------+
> | removeFromUniqueCallables  |          |         ...         |
> +----------------------------+          +---------------------+
> |            ...             |          |        queue        |
> +----------------------------+          +---------------------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+
> Thread 1 and Thread 2 execute CallableWrapper's functions in this order:
>   1. Thread 1 executes removeFromUniqueCallables;
>   2. Thread 2 executes queue, adding ActionStartXCommand to the queue and to 
> uniqueCallables;
>   3. Thread 1 executes queue to add ForkedActionStartXCommand, but 
> filterDuplicates() finds an XCommand with the same name in uniqueCallables, 
> so the command is not added to the queue.
> Because ForkedActionStartXCommand and ActionStartXCommand have the same name 
> and Thread 2 enqueues ActionStartXCommand before Thread 1, 
> ForkedActionStartXCommand is lost (never executed), and the thread that 
> submits the fork actions in parallel blocks at 
> CallableQueueService.blockingWait(). 
> {code}
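
The three steps above can be replayed deterministically in a small sketch. This is an illustration under stated assumptions, not Oozie's implementation: UniqueQueueSketch, offer, removeKey and the key strings are hypothetical, and a plain HashSet stands in for uniqueCallables. The variant with distinct keys is one possible remedy shown for contrast only; the attached patches are not reproduced here, so it should not be read as the committed fix.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of a queue that drops commands whose key is already
// registered, replaying steps 1-3 in a single thread.
final class UniqueQueueSketch {
    private final Set<String> uniqueKeys = new HashSet<>();
    private final Queue<String> queue = new ArrayDeque<>();

    /** Enqueues a command unless one with the same key is already registered. */
    boolean offer(String commandType, String key) {
        if (!uniqueKeys.add(key)) {
            return false; // treated as a duplicate and silently dropped
        }
        queue.add(commandType + ":" + key);
        return true;
    }

    /** Stands in for removing the key when a wrapper starts running. */
    void removeKey(String key) {
        uniqueKeys.remove(key);
    }

    /** Returns true if the forked command made it into the queue. */
    static boolean forkedCommandAccepted(boolean distinctKeys) {
        UniqueQueueSketch q = new UniqueQueueSketch();
        String action = "action-1";
        String recoveryKey = "start_" + action;
        String forkedKey = (distinctKeys ? "forked_start_" : "start_") + action;

        q.removeKey(forkedKey);                       // step 1, Thread 1
        q.offer("ActionStartXCommand", recoveryKey);  // step 2, Thread 2
        return q.offer("ForkedActionStartXCommand", forkedKey); // step 3, Thread 1
    }
}
```

When both commands share a key, forkedCommandAccepted(false) reports that the forked command was dropped, which is the situation in which a caller waiting for it would block forever. With distinct keys the forked command is accepted.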
>  
> *CallableWrapper's code*
> {code:java}
> public class CallableWrapper extends PriorityDelayQueue.QueueElement 
>         implements Runnable, Callable {
>     private Instrumentation.Cron cron;
>     public void run() {
>         XCallable callable = null;
>         try {
>             removeFromUniqueCallables();
>             if (Services.get().getSystemMode() == SYSTEM_MODE.SAFEMODE) {
>                 log.info("Oozie is in SAFEMODE, requeuing callable [{0}] with [{1}]ms delay",
>                         getElement().getType(), SAFE_MODE_DELAY);
>                 setDelay(SAFE_MODE_DELAY, TimeUnit.MILLISECONDS);
>                 queue(this, true);
>                 return;
>             }
>             callable = getElement();
>             if (callableBegin(callable)) {
>                 cron.stop();
>                 addInQueueCron(cron);
>                 XLog log = XLog.getLog(getClass());
>                 log.trace("executing callable [{0}]", callable.getName());
>                 try {
>                     // FutureTask.run() will invoke callable.call()
>                     super.run();
>                     incrCounter(INSTR_EXECUTED_COUNTER, 1);
>                     log.trace("executed callable [{0}]", callable.getName());
>                 }
>                 catch (Exception ex) {
>                     incrCounter(INSTR_FAILED_COUNTER, 1);
>                     log.warn("exception callable [{0}], {1}", 
>                             callable.getName(), ex.getMessage(),

[jira] [Comment Edited] (OOZIE-3717) When fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cau

2023-08-07 Thread chenhaodan (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751270#comment-17751270
 ] 

chenhaodan edited comment on OOZIE-3717 at 8/7/23 8:20 AM:
---

[~dionusos] I am sorry about that. I have fixed it in [^OOZIE-3717-003.patch]

Thanks for your time.


was (Author: chenhd):
[~dionusos] I am sorry for that. I had fixed them in [^OOZIE-3717-003.patch]

Thanks for your time.

> When fork actions parallel submit, because ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Attachments: OOZIE-3717-001.patch, OOZIE-3717-002.patch, 
> OOZIE-3717-003.patch
>
>
> When fork actions are submitted in parallel, a ForkedActionStartXCommand is 
> added to the queue, while RecoveryService checks pending actions and may add an 
> ActionStartXCommand. If the ForkedActionStartXCommand is enqueued while an 
> ActionStartXCommand for the same action is already in the queue, it is lost. 
> The thread that submits the actions in parallel blocks at 
> CallableQueueService.blockingWait(), waiting for the ForkedActionStartXCommand 
> to finish, but because that command was lost, the result is a deadlock.
> {code:java}
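>           Thread 1                          Thread 2
>   (ForkedActionStartXCommand)        (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |   ...   |
> +----------------------------+          +---------+
> |            ...             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+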
> Thread 1 and Thread 2 execute the CallableWrapper functions in the following order:
>   1. Thread 1 executes removeFromUniqueCallables(); 
>   2. Thread 2 executes queue(), adding ActionStartXCommand to the queue and to 
> uniqueCallables;
>   3. Thread 1 executes queue() to add ForkedActionStartXCommand to the queue, but 
> filterDuplicates() finds an XCommand with the same name in uniqueCallables, so 
> the command is skipped and never queued;
> Because ForkedActionStartXCommand and ActionStartXCommand have the same name and 
> Thread 2 enqueues ActionStartXCommand before Thread 1, the 
> ForkedActionStartXCommand is lost (never executed), and the thread that 
> submits the fork actions in parallel blocks at CallableQueueService.blockingWait(). 
> {code}
>  
> *CallableWrapper's code*
> {code:java}
> public class CallableWrapper extends PriorityDelayQueue.QueueElement implements Runnable, Callable {
>     private Instrumentation.Cron cron;
>
>     public void run() {
>         XCallable callable = null;
>         try {
>             removeFromUniqueCallables();
>             if (Services.get().getSystemMode() == SYSTEM_MODE.SAFEMODE) {
>                 log.info("Oozie is in SAFEMODE, requeuing callable [{0}] with [{1}]ms delay",
>                         getElement().getType(), SAFE_MODE_DELAY);
>                 setDelay(SAFE_MODE_DELAY, TimeUnit.MILLISECONDS);
>                 queue(this, true);
>                 return;
>             }
>             callable = getElement();
>             if (callableBegin(callable)) {
>                 cron.stop();
>                 addInQueueCron(cron);
>                 XLog log = XLog.getLog(getClass());
>                 log.trace("executing callable [{0}]", callable.getName());
>                 try {
>                     // FutureTask.run() will invoke callable.call()
>                     super.run();
>                     incrCounter(INSTR_EXECUTED_COUNTER, 1);
>                     log.trace("executed callable [{0}]", callable.getName());
>                 }
>                 catch (Exception ex) {
>                     incrCounter(INSTR_FAILED_COUNTER, 1);
>                     log.warn("exception callable [{0}], {1}", callable.getName(), ex.getMessage(), ex);
>                 }
>             }
>             else {
>                 log.warn("max concurrency for callable [{0}] exceeded, requeuein

[jira] [Comment Edited] (OOZIE-3717) When fork actions parallel submit, because ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cau

2023-08-07 Thread chenhaodan (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751270#comment-17751270
 ] 

chenhaodan edited comment on OOZIE-3717 at 8/7/23 8:20 AM:
---

[~dionusos] I am sorry about that. I have fixed them in [^OOZIE-3717-003.patch]

Thanks for your time.


was (Author: chenhd):
[~dionusos] I am sorry for that. I has fixed them in [^OOZIE-3717-003.patch] 
[|https://issues.apache.org/jira/secure/DeleteAttachment!default.jspa?id=13545401&deleteAttachmentId=13061935&from=issue]

Thanks for your time.

> When fork actions parallel submit, because ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
>     URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Attachments: OOZIE-3717-001.patch, OOZIE-3717-002.patch, 
> OOZIE-3717-003.patch
>
>

[jira] [Commented] (OOZIE-3717) When fork actions parallel submit, because ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause de

2023-08-05 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751281#comment-17751281
 ] 

Hadoop QA commented on OOZIE-3717:
--


Testing JIRA OOZIE-3717

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:green}+1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:green}+1{color} the patch does not introduce any star imports
.{color:green}+1{color} the patch does not introduce any line longer than 
132
.{color:green}+1{color} the patch adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} Javadoc generation succeeded with the patch
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warning(s)
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:red}-1{color} There are [3] new bugs found below threshold in total that 
must be fixed.
.{color:green}+1{color} There are no new bugs found in [examples].
.{color:green}+1{color} There are no new bugs found in 
[fluent-job/fluent-job-api].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive2].
.{color:green}+1{color} There are no new bugs found in [sharelib/git].
.{color:green}+1{color} There are no new bugs found in [sharelib/distcp].
.{color:green}+1{color} There are no new bugs found in [sharelib/hcatalog].
.{color:green}+1{color} There are no new bugs found in [sharelib/sqoop].
.{color:green}+1{color} There are no new bugs found in [sharelib/spark].
.{color:green}+1{color} There are no new bugs found in [sharelib/oozie].
.{color:green}+1{color} There are no new bugs found in [sharelib/pig].
.{color:green}+1{color} There are no new bugs found in [sharelib/streaming].
.{color:green}+1{color} There are no new bugs found in [server].
.{color:green}+1{color} There are no new bugs found in [docs].
.{color:green}+1{color} There are no new bugs found in [webapp].
.{color:red}-1{color} There are [3] new bugs found below threshold in 
[core] that must be fixed.
.You can find the SpotBugs diff here (look for the red and orange ones): 
core/findbugs-new.html
.The most important SpotBugs errors are:
.At BulkJPAExecutor.java:[line 206]: This use of 
javax/persistence/EntityManager.createQuery(Ljava/lang/String;)Ljavax/persistence/Query;
 can be vulnerable to SQL/JPQL injection
.At BulkJPAExecutor.java:[line 176]: At BulkJPAExecutor.java:[line 175]
.At BulkJPAExecutor.java:[line 205]: At BulkJPAExecutor.java:[line 199]
.Unsafe comparison of hash that are susceptible to timing attack: At 
BulkJPAExecutor.java:[line 206]
.At ShareLibService.java:[line 689]: At ShareLibService.java:[line 695]
.{color:green}+1{color} There are no new bugs found in [tools].
.{color:green}+1{color} There are no new bugs found in [client].
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 3262
.{color:orange}Tests failed at first run:{color}
TestCoordActionInputCheckXCommand#testCoordActionInputCheckXCommandUniqueness
.For the complete list of flaky tests, see TEST-SUMMARY-FULL files.
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 
{color:green}+1 MODERNIZER{color}


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

. https://ci-hadoop.apache.org/job/PreCommit-OOZIE-Build/213/



> When fork actions parallel submit, because ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chen

[jira] [Commented] (OOZIE-3717) When fork actions parallel submit, because ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause de

2023-08-04 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751273#comment-17751273
 ] 

Hadoop QA commented on OOZIE-3717:
--

PreCommit-OOZIE-Build started


> When fork actions parallel submit, because ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Attachments: OOZIE-3717-001.patch, OOZIE-3717-002.patch, 
> OOZIE-3717-003.patch
>
>

[jira] [Commented] (OOZIE-3717) When fork actions parallel submit, because ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause de

2023-08-04 Thread chenhaodan (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751270#comment-17751270
 ] 

chenhaodan commented on OOZIE-3717:
---

[~dionusos] I am sorry about that. I have fixed them in [^OOZIE-3717-003.patch]

Thanks for your time.

> When fork actions parallel submit, because ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
>     URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Attachments: OOZIE-3717-001.patch, OOZIE-3717-002.patch, 
> OOZIE-3717-003.patch
>
>

[jira] [Updated] (OOZIE-3717) When fork actions parallel submit, because ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause dead

2023-08-04 Thread chenhaodan (Jira)



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Attachment: OOZIE-3717-003.patch

> When fork actions parallel submit, because ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Attachments: OOZIE-3717-001.patch, OOZIE-3717-002.patch, 
> OOZIE-3717-003.patch
>
>

[jira] [Commented] (OOZIE-3717) When fork actions parallel submit, because ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause de

2023-08-04 Thread Jira



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751026#comment-17751026
 ] 

Dénes Bodó commented on OOZIE-3717:
---

[~chenhd] Your change looks good to me overall.

However, could you please fix these errors reported by Jenkins, along with one minor typo?
{noformat}
. -1 the patch contains 1 star import(s)
. -1 the patch contains 4 line(s) longer than 132 characters 

it would be lose. => it would be lost.
private XLog log => can be final{noformat}
Removing the unused imports from the affected classes is also very welcome.

 

Let me start additional tests on your change (~ 1 day). If those pass and the 
above issues are fixed, I think we are good to merge.

> When fork actions parallel submit, because ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
> Attachments: OOZIE-3717-001.patch, OOZIE-3717-002.patch
>
>

[jira] [Updated] (OOZIE-3717) When fork actions parallel submit, because ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause dead

2023-08-04 Thread Jira



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dénes Bodó updated OOZIE-3717:
--
Fix Version/s: (was: trunk)

> When fork actions parallel submit, because ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Attachments: OOZIE-3717-001.patch, OOZIE-3717-002.patch
>
>

[jira] [Updated] (OOZIE-3717) When fork actions parallel submit, because ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause dead

2023-07-31 Thread chenhaodan (Jira)



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Description: 
When fork actions are submitted in parallel, a ForkedActionStartXCommand is 
added to the queue, while RecoveryService checks pending actions and may add an 
ActionStartXCommand. If the ForkedActionStartXCommand is enqueued while an 
ActionStartXCommand for the same action is already in the queue, it is lost. 
The thread that submits the actions in parallel blocks at 
CallableQueueService.blockingWait(), waiting for the ForkedActionStartXCommand 
to finish, but because that command was lost, the result is a deadlock.
{code:java}
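Thread 1                                Thread 2
(ForkedActionStartXCommand)             (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |   ...   |
+----------------------------+          +---------+
|            ...             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+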

Thread 1 and Thread 2 execute CallableWrapper's functions in this order:
  1. Thread 1 executes removeFromUniqueCallables();
  2. Thread 2 executes queue(), adding the ActionStartXCommand to the queue 
and to uniqueCallables;
  3. Thread 1 executes queue() to add the ForkedActionStartXCommand, but 
filterDuplicates() finds an XCommand with the same name in uniqueCallables, so 
the add is skipped.

Because ForkedActionStartXCommand and ActionStartXCommand have the same name 
and Thread 2 enqueues the ActionStartXCommand before Thread 1 enqueues, the 
ForkedActionStartXCommand is lost (never executed), and the thread that 
submits the fork actions in parallel blocks at 
CallableQueueService.blockingWait().
{code}
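
The three steps above can be replayed on a single thread to see why the forked command disappears. This is only a minimal model, not Oozie code: the class LostEnqueueSketch, its enqueue() method and the key string are hypothetical stand-ins for CallableQueueService, its duplicate filter and the shared command name. It also assumes that Thread 1 requeues its own wrapper after removing the key, as the diagram shows.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Simplified, hypothetical model of a de-duplicating queue; NOT Oozie's CallableQueueService.
class LostEnqueueSketch {
    // Stands in for uniqueCallables: keys of callables currently registered as queued.
    final Set<String> uniqueCallables = new HashSet<>();
    // Stands in for the real queue; entries are "<command type>:<key>".
    final Queue<String> queue = new ArrayDeque<>();

    // Adds the command unless its key is already registered (the duplicate filter).
    boolean enqueue(String type, String key) {
        if (!uniqueCallables.add(key)) {
            return false; // dropped silently: nothing tells the caller to retry
        }
        queue.add(type + ":" + key);
        return true;
    }

    // Replays steps 1-3 on a single thread, so the outcome is deterministic.
    static LostEnqueueSketch replay() {
        LostEnqueueSketch q = new LostEnqueueSketch();
        String key = "action.start_wf-1@node-a"; // made-up key shared by both command types

        // Setup: the forked command was queued earlier and has just been picked up by Thread 1.
        q.enqueue("ForkedActionStartXCommand", key);
        q.queue.poll();

        q.uniqueCallables.remove(key);          // step 1: Thread 1 removes the key
        q.enqueue("ActionStartXCommand", key);  // step 2: Thread 2 enqueues the same key
        boolean requeued =
                q.enqueue("ForkedActionStartXCommand", key); // step 3: Thread 1 requeues
        System.out.println("forked command requeued: " + requeued);
        return q;
    }

    public static void main(String[] args) {
        System.out.println("queue contents: " + replay().queue);
    }
}
```

In this model the requeue in step 3 is rejected and only the ActionStartXCommand entry remains, so a caller waiting for the forked command to finish would have nothing left to wait on.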
 
*CallableWrapper's code*
{code:java}
public class CallableWrapper extends PriorityDelayQueue.QueueElement
        implements Runnable, Callable {
    private Instrumentation.Cron cron;

    public void run() {
        XCallable callable = null;
        try {
            removeFromUniqueCallables();
            if (Services.get().getSystemMode() == SYSTEM_MODE.SAFEMODE) {
                log.info("Oozie is in SAFEMODE, requeuing callable [{0}] with [{1}]ms delay",
                        getElement().getType(), SAFE_MODE_DELAY);
                setDelay(SAFE_MODE_DELAY, TimeUnit.MILLISECONDS);
                queue(this, true);
                return;
            }
            callable = getElement();
            if (callableBegin(callable)) {
                cron.stop();
                addInQueueCron(cron);
                XLog log = XLog.getLog(getClass());
                log.trace("executing callable [{0}]", callable.getName());

                try {
                    // FutureTask.run() will invoke callable.call()
                    super.run();
                    incrCounter(INSTR_EXECUTED_COUNTER, 1);
                    log.trace("executed callable [{0}]", callable.getName());
                }
                catch (Exception ex) {
                    incrCounter(INSTR_FAILED_COUNTER, 1);
                    log.warn("exception callable [{0}], {1}", callable.getName(), ex.getMessage(), ex);
                }
            }
            else {
                log.warn("max concurrency for callable [{0}] exceeded, requeueing with [{1}]ms delay",
                        callable.getType(), CONCURRENCY_DELAY);
                setDelay(CONCURRENCY_DELAY, TimeUnit.MILLISECONDS);
                queue(this, true);
                incrCounter(callable.getType() + "#exceeded.concurrency", 1);
            }
        }
        catch (Throwable t) {
            incrCounter(INSTR_FAILED_COUNTER, 1);
            log.warn("exception callable [{0}], {1}", callable == null ? "N/A" : callable.getName(),
                    t.getMessage(), t);
        }
        finally {
            if (callable != null) {
                callableEnd(callable);
            }
        }
    }
}
 {code}
 

 

  was:
when fork actions parallel submit will add ForkedActionStartXCommand and 

RecoveryService will check pending action may add ActionStartXCommand, if 
ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
action) in queue, it would be lose. The thread parallel submit actions block at 
CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
finish, but ForkedActionStartXCommand had lost and cause deadlock.
{code:java}
Thread 1  Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
++          +-+
| removeFromUniqueCallables  |          |  .  |
++          +-+
|   ..   |

[jira] [Commented] (OOZIE-3717) When fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause de

2023-07-30 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748911#comment-17748911
 ] 

Hadoop QA commented on OOZIE-3717:
--


Testing JIRA OOZIE-3717

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 1 star import(s)
.{color:red}-1{color} the patch contains 4 line(s) longer than 132 
characters
.{color:green}+1{color} the patch adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} Javadoc generation succeeded with the patch
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warning(s)
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:red}-1{color} There are [3] new bugs found below threshold in total that 
must be fixed.
.{color:red}-1{color} There are [3] new bugs found below threshold in 
[core] that must be fixed.
.You can find the SpotBugs diff here (look for the red and orange ones): 
core/findbugs-new.html
.The most important SpotBugs errors are:
.At BulkJPAExecutor.java:[line 206]: This use of 
javax/persistence/EntityManager.createQuery(Ljava/lang/String;)Ljavax/persistence/Query;
 can be vulnerable to SQL/JPQL injection
.At BulkJPAExecutor.java:[line 176]: At BulkJPAExecutor.java:[line 175]
.At BulkJPAExecutor.java:[line 205]: At BulkJPAExecutor.java:[line 199]
.Unsafe comparison of hash that are susceptible to timing attack: At 
BulkJPAExecutor.java:[line 206]
.At ShareLibService.java:[line 689]: At ShareLibService.java:[line 695]
.{color:green}+1{color} There are no new bugs found in [client].
.{color:green}+1{color} There are no new bugs found in [docs].
.{color:green}+1{color} There are no new bugs found in 
[fluent-job/fluent-job-api].
.{color:green}+1{color} There are no new bugs found in [server].
.{color:green}+1{color} There are no new bugs found in [examples].
.{color:green}+1{color} There are no new bugs found in [tools].
.{color:green}+1{color} There are no new bugs found in [webapp].
.{color:green}+1{color} There are no new bugs found in [sharelib/distcp].
.{color:green}+1{color} There are no new bugs found in [sharelib/spark].
.{color:green}+1{color} There are no new bugs found in [sharelib/oozie].
.{color:green}+1{color} There are no new bugs found in [sharelib/hcatalog].
.{color:green}+1{color} There are no new bugs found in [sharelib/streaming].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive2].
.{color:green}+1{color} There are no new bugs found in [sharelib/sqoop].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive].
.{color:green}+1{color} There are no new bugs found in [sharelib/pig].
.{color:green}+1{color} There are no new bugs found in [sharelib/git].
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 3262
.{color:orange}Tests failed at first run:{color}
TestSignalXCommand#testDeadlockForForkParallelSubmit
.For the complete list of flaky tests, see TEST-SUMMARY-FULL files.
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 
{color:green}+1 MODERNIZER{color}


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

. https://ci-hadoop.apache.org/job/PreCommit-OOZIE-Build/212/



> When fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assign

[jira] [Commented] (OOZIE-3717) When fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause de

2023-07-30 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748891#comment-17748891
 ] 

Hadoop QA commented on OOZIE-3717:
--

PreCommit-OOZIE-Build started


> When fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
> Attachments: OOZIE-3717-001.patch, OOZIE-3717-002.patch
>
>
> when fork actions parallel submit will add ForkedActionStartXCommand and 
> RecoveryService will check pending action may add ActionStartXCommand, if 
> ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
> action) in queue, it would be lose. The thread parallel submit actions block 
> at CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
> finish, but ForkedActionStartXCommand had lost and cause deadlock.
> {code:java}
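> Thread 1                                Thread 2
> (ForkedActionStartXCommand)             (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |   ...   |
> +----------------------------+          +---------+
> |            ...             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+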
> Thread 1 and Thread 2 execute CallableWrapper's execute function order :
>   1. Thread 1 execute removeFromUniqueCallables; 
>   2. Thread 2 execute queue add ActionStartXCommand into queue and add to 
> uniqueCallables;
>   3. Thread 1 execute queue add ForkedActionStartXCommand into queue, but 
> filterDuplicates() function found a same name XCommand in uniqueCallables, so 
> skip add to queue;
> Becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, 
> Thread 2 add ActionStartXCommand enqueue before Thread 1, so 
> ForkedActionStartXCommand would be lost(never execute), and the thread that 
> fork actions parallel submit block at CallableQueueService.blockingWait(). 
> {code}
>  
> *CallableWrapper's code*
> {code:java}
> public class CallableWrapper extends PriorityDelayQueue.QueueElement 
> implements Runnable, Callable {
> private Instrumentation.Cron cron;
> public void run() {
> XCallable callable = null;
> try {
> removeFromUniqueCallables();
> if (Services.get().getSystemMode() == SYSTEM_MODE.SAFEMODE) {
> log.info("Oozie is in SAFEMODE, requeuing callable [{0}] with 
> [{1}]ms delay", getElement().getType(),
> SAFE_MODE_DELAY);
> setDelay(SAFE_MODE_DELAY, TimeUnit.MILLISECONDS);
> queue(this, true);
> return;
> }
> callable = getElement();
> if (callableBegin(callable)) {
> cron.stop();
> addInQueueCron(cron);
> XLog log = XLog.getLog(getClass());
> log.trace("executing callable [{0}]", callable.getName());
> try {
> //FutureTask.run() will invoke cllable.call()
> super.run();
> incrCounter(INSTR_EXECUTED_COUNTER, 1);
> log.trace("executed callable [{0}]", callable.getName());
> }
> catch (Exception ex) {
> incrCounter(INSTR_FAILED_COUNTER, 1);
> log.warn("exception callable [{0}], {1}", 
> callable.getName(), ex.getMessage(), ex);
> }
> }
> else {
> log.warn("max concurrency for callable [{0}] exceeded, 
> requeueing with [{1}]ms delay", callable
> .getType(), CONCURRENCY_DELAY);
> setDelay(CONCURRENCY_DELAY, TimeUnit.MILLISECONDS);
> queue(this, true);

[jira] [Updated] (OOZIE-3717) When fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause dead



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Attachment: OOZIE-3717-002.patch

> When fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
> Attachments: OOZIE-3717-001.patch, OOZIE-3717-002.patch
>
>
> when fork actions parallel submit will add ForkedActionStartXCommand and 
> RecoveryService will check pending action may add ActionStartXCommand, if 
> ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
> action) in queue, it would be lose. The thread parallel submit actions block 
> at CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
> finish, but ForkedActionStartXCommand had lost and cause deadlock.
> {code:java}
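> Thread 1                                Thread 2
> (ForkedActionStartXCommand)             (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |   ...   |
> +----------------------------+          +---------+
> |            ...             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+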
> Thread 1 and Thread 2 execute CallableWrapper's execute function order :
>   1. Thread 1 execute removeFromUniqueCallables; 
>   2. Thread 2 execute queue add ActionStartXCommand into queue and add to 
> uniqueCallables;
>   3. Thread 1 execute queue add ForkedActionStartXCommand into queue, but 
> filterDuplicates() function found a same name XCommand in uniqueCallables, so 
> skip add to queue;
> Becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, 
> Thread 2 add ActionStartXCommand enqueue before Thread 1, so 
> ForkedActionStartXCommand would be lost(never execute), and the thread that 
> fork actions parallel submit block at CallableQueueService.blockingWait(). 
> {code}
>  
> *CallableWrapper's code*
> {code:java}
> public class CallableWrapper extends PriorityDelayQueue.QueueElement 
> implements Runnable, Callable {
> private Instrumentation.Cron cron;
> public void run() {
> XCallable callable = null;
> try {
> removeFromUniqueCallables();
> if (Services.get().getSystemMode() == SYSTEM_MODE.SAFEMODE) {
> log.info("Oozie is in SAFEMODE, requeuing callable [{0}] with 
> [{1}]ms delay", getElement().getType(),
> SAFE_MODE_DELAY);
> setDelay(SAFE_MODE_DELAY, TimeUnit.MILLISECONDS);
> queue(this, true);
> return;
> }
> callable = getElement();
> if (callableBegin(callable)) {
> cron.stop();
> addInQueueCron(cron);
> XLog log = XLog.getLog(getClass());
> log.trace("executing callable [{0}]", callable.getName());
> try {
> //FutureTask.run() will invoke cllable.call()
> super.run();
> incrCounter(INSTR_EXECUTED_COUNTER, 1);
> log.trace("executed callable [{0}]", callable.getName());
> }
> catch (Exception ex) {
> incrCounter(INSTR_FAILED_COUNTER, 1);
> log.warn("exception callable [{0}], {1}", 
> callable.getName(), ex.getMessage(), ex);
> }
> }
> else {
> log.warn("max concurrency for callable [{0}] exceeded, 
> requeueing with [{1}]ms delay", callable
> .getType(), CONCURRENCY_DELAY);
> setDelay(CONCURRENCY_DELAY, TimeUnit.MILLISECONDS);
> queue(this, true);
> incrCounter(callable.getType() + "#exceeded.con

[jira] [Updated] (OOZIE-3717) When fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause dead



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Description: 
When fork actions are submitted in parallel, a ForkedActionStartXCommand is 
added to the queue. RecoveryService, while checking pending actions, may add 
an ActionStartXCommand for the same action. If the ForkedActionStartXCommand 
is enqueued while an ActionStartXCommand for the same action is already in the 
queue, the ForkedActionStartXCommand is lost. The thread that submits the 
actions in parallel blocks at CallableQueueService.blockingWait(), waiting for 
the ForkedActionStartXCommand to finish, but that command has been lost, which 
causes a deadlock.
{code:java}
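Thread 1                                Thread 2
(ForkedActionStartXCommand)             (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |   ...   |
+----------------------------+          +---------+
|            ...             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+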

Thread 1 and Thread 2 execute CallableWrapper's functions in this order:
  1. Thread 1 executes removeFromUniqueCallables();
  2. Thread 2 executes queue(), adding the ActionStartXCommand to the queue 
and to uniqueCallables;
  3. Thread 1 executes queue() to add the ForkedActionStartXCommand, but 
filterDuplicates() finds an XCommand with the same name in uniqueCallables, so 
the add is skipped.

Because ForkedActionStartXCommand and ActionStartXCommand have the same name 
and Thread 2 enqueues the ActionStartXCommand before Thread 1 enqueues, the 
ForkedActionStartXCommand is lost (never executed), and the thread that 
submits the fork actions in parallel blocks at 
CallableQueueService.blockingWait().
{code}
 
*CallableWrapper's code*
{code:java}
public class CallableWrapper extends PriorityDelayQueue.QueueElement
        implements Runnable, Callable {
    private Instrumentation.Cron cron;

    public void run() {
        XCallable callable = null;
        try {
            removeFromUniqueCallables();
            if (Services.get().getSystemMode() == SYSTEM_MODE.SAFEMODE) {
                log.info("Oozie is in SAFEMODE, requeuing callable [{0}] with [{1}]ms delay",
                        getElement().getType(), SAFE_MODE_DELAY);
                setDelay(SAFE_MODE_DELAY, TimeUnit.MILLISECONDS);
                queue(this, true);
                return;
            }
            callable = getElement();
            if (callableBegin(callable)) {
                cron.stop();
                addInQueueCron(cron);
                XLog log = XLog.getLog(getClass());
                log.trace("executing callable [{0}]", callable.getName());

                try {
                    // FutureTask.run() will invoke callable.call()
                    super.run();
                    incrCounter(INSTR_EXECUTED_COUNTER, 1);
                    log.trace("executed callable [{0}]", callable.getName());
                }
                catch (Exception ex) {
                    incrCounter(INSTR_FAILED_COUNTER, 1);
                    log.warn("exception callable [{0}], {1}", callable.getName(), ex.getMessage(), ex);
                }
            }
            else {
                log.warn("max concurrency for callable [{0}] exceeded, requeueing with [{1}]ms delay",
                        callable.getType(), CONCURRENCY_DELAY);
                setDelay(CONCURRENCY_DELAY, TimeUnit.MILLISECONDS);
                queue(this, true);
                incrCounter(callable.getType() + "#exceeded.concurrency", 1);
            }
        }
        catch (Throwable t) {
            incrCounter(INSTR_FAILED_COUNTER, 1);
            log.warn("exception callable [{0}], {1}", callable == null ? "N/A" : callable.getName(),
                    t.getMessage(), t);
        }
        finally {
            if (callable != null) {
                callableEnd(callable);
            }
        }
    }
}
 {code}
 

 

  was:
Fork actions parallel submit, will add ForkedActionStartXCommand and 

RecoveryService will check pending action may add ActionStartXCommand, if 
ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
action) in queue, it would be lose. The thread parallel submit actions block at 
CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
finish, but ForkedActionStartXCommand had lost and cause deadlock.
{code:java}
Thread 1  Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
++          +-+
| removeFromUniqueCallables  |          |  .  |
++          +-+
|   ..   |

[jira] [Updated] (OOZIE-3717) When fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause dead



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Description: 
Fork actions parallel submit, will add ForkedActionStartXCommand and 

RecoveryService will check pending action may add ActionStartXCommand, if 
ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
action) in queue, it would be lose. The thread parallel submit actions block at 
CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
finish, but ForkedActionStartXCommand had lost and cause deadlock.
{code:java}
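Thread 1                                Thread 2
(ForkedActionStartXCommand)             (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |   ...   |
+----------------------------+          +---------+
|            ...             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+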

Thread 1 and Thread 2 execute CallableWrapper's execute function order :
  1. Thread 1 execute removeFromUniqueCallables; 
  2. Thread 2 execute queue add ActionStartXCommand into queue and add to 
uniqueCallables;
  3. Thread 1 execute queue add ForkedActionStartXCommand into queue, but 
filterDuplicates() function found a same name XCommand in uniqueCallables, so 
skip add to queue;

Becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, 
Thread 2 add ActionStartXCommand enqueue before Thread 1, so 
ForkedActionStartXCommand would be lost(never execute), and the thread that 
fork actions parallel submit block at CallableQueueService.blockingWait(). 
{code}
 
*CallableWrapper's code*
{code:java}
public class CallableWrapper extends PriorityDelayQueue.QueueElement
        implements Runnable, Callable {
    private Instrumentation.Cron cron;

    public void run() {
        XCallable callable = null;
        try {
            removeFromUniqueCallables();
            if (Services.get().getSystemMode() == SYSTEM_MODE.SAFEMODE) {
                log.info("Oozie is in SAFEMODE, requeuing callable [{0}] with [{1}]ms delay",
                        getElement().getType(), SAFE_MODE_DELAY);
                setDelay(SAFE_MODE_DELAY, TimeUnit.MILLISECONDS);
                queue(this, true);
                return;
            }
            callable = getElement();
            if (callableBegin(callable)) {
                cron.stop();
                addInQueueCron(cron);
                XLog log = XLog.getLog(getClass());
                log.trace("executing callable [{0}]", callable.getName());

                try {
                    // FutureTask.run() will invoke callable.call()
                    super.run();
                    incrCounter(INSTR_EXECUTED_COUNTER, 1);
                    log.trace("executed callable [{0}]", callable.getName());
                }
                catch (Exception ex) {
                    incrCounter(INSTR_FAILED_COUNTER, 1);
                    log.warn("exception callable [{0}], {1}", callable.getName(), ex.getMessage(), ex);
                }
            }
            else {
                log.warn("max concurrency for callable [{0}] exceeded, requeueing with [{1}]ms delay",
                        callable.getType(), CONCURRENCY_DELAY);
                setDelay(CONCURRENCY_DELAY, TimeUnit.MILLISECONDS);
                queue(this, true);
                incrCounter(callable.getType() + "#exceeded.concurrency", 1);
            }
        }
        catch (Throwable t) {
            incrCounter(INSTR_FAILED_COUNTER, 1);
            log.warn("exception callable [{0}], {1}", callable == null ? "N/A" : callable.getName(),
                    t.getMessage(), t);
        }
        finally {
            if (callable != null) {
                callableEnd(callable);
            }
        }
    }
}
 {code}
 

 

  was:
Fork actions parallel submit, so will add ForkedActionStartXCommand and 

RecoveryService will check pending action may add ActionStartXCommand, if 
ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
action) in queue, it would be lose. The thread parallel submit actions block at 
CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
finish, but ForkedActionStartXCommand had lost and cause deadlock.
{code:java}
Thread 1  Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
++          +-+
| removeFromUniqueCallables  |          |  .  |
++          +-+
|   ..   |

[jira] [Updated] (OOZIE-3717) When fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause dead



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Description: 
Fork actions parallel submit, so will add ForkedActionStartXCommand and 

RecoveryService will check pending action may add ActionStartXCommand, if 
ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
action) in queue, it would be lose. The thread parallel submit actions block at 
CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
finish, but ForkedActionStartXCommand had lost and cause deadlock.
{code:java}
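Thread 1                                Thread 2
(ForkedActionStartXCommand)             (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |   ...   |
+----------------------------+          +---------+
|            ...             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+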

Thread 1 and Thread 2 execute CallableWrapper's execute function order :
  1. Thread 1 execute removeFromUniqueCallables; 
  2. Thread 2 execute queue add ActionStartXCommand into queue and add to 
uniqueCallables;
  3. Thread 1 execute queue add ForkedActionStartXCommand into queue, but 
filterDuplicates() function found a same name XCommand in uniqueCallables, so 
skip add to queue;

Becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, 
Thread 2 add ActionStartXCommand enqueue before Thread 1, so 
ForkedActionStartXCommand would be lost(never execute), and the thread that 
fork actions parallel submit block at CallableQueueService.blockingWait(). 
{code}
 
*CallableWrapper's code*
{code:java}
public class CallableWrapper extends PriorityDelayQueue.QueueElement
        implements Runnable, Callable {
    private Instrumentation.Cron cron;

    public void run() {
        XCallable callable = null;
        try {
            removeFromUniqueCallables();
            if (Services.get().getSystemMode() == SYSTEM_MODE.SAFEMODE) {
                log.info("Oozie is in SAFEMODE, requeuing callable [{0}] with [{1}]ms delay",
                        getElement().getType(), SAFE_MODE_DELAY);
                setDelay(SAFE_MODE_DELAY, TimeUnit.MILLISECONDS);
                queue(this, true);
                return;
            }
            callable = getElement();
            if (callableBegin(callable)) {
                cron.stop();
                addInQueueCron(cron);
                XLog log = XLog.getLog(getClass());
                log.trace("executing callable [{0}]", callable.getName());

                try {
                    // FutureTask.run() will invoke callable.call()
                    super.run();
                    incrCounter(INSTR_EXECUTED_COUNTER, 1);
                    log.trace("executed callable [{0}]", callable.getName());
                }
                catch (Exception ex) {
                    incrCounter(INSTR_FAILED_COUNTER, 1);
                    log.warn("exception callable [{0}], {1}", callable.getName(), ex.getMessage(), ex);
                }
            }
            else {
                log.warn("max concurrency for callable [{0}] exceeded, requeueing with [{1}]ms delay",
                        callable.getType(), CONCURRENCY_DELAY);
                setDelay(CONCURRENCY_DELAY, TimeUnit.MILLISECONDS);
                queue(this, true);
                incrCounter(callable.getType() + "#exceeded.concurrency", 1);
            }
        }
        catch (Throwable t) {
            incrCounter(INSTR_FAILED_COUNTER, 1);
            log.warn("exception callable [{0}], {1}", callable == null ? "N/A" : callable.getName(),
                    t.getMessage(), t);
        }
        finally {
            if (callable != null) {
                callableEnd(callable);
            }
        }
    }
}
 {code}
 

 

  was:
Fork actions parallel submit, so will add ForkedActionStartXCommand and 

RecoveryService will check pending action may add ActionStartXCommand, if 
ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
action) in queue, it would be lose. The thread parallel submit actions block at 
CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
finish, but ForkedActionStartXCommand had lost and cause deadlock.
{code:java}
Thread 1  Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
++          +-+
| removeFromUniqueCallables  |          |  .  |
++          +-+
|   ..   |

[jira] [Updated] (OOZIE-3717) When fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause dead



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Description: 
Fork actions parallel submit, so will add ForkedActionStartXCommand and 

RecoveryService will check pending action may add ActionStartXCommand, if 
ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
action) in queue, it would be lose. The thread parallel submit actions block at 
CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
finish, but ForkedActionStartXCommand had lost and cause deadlock.
{code:java}
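Thread 1                                Thread 2
(ForkedActionStartXCommand)             (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |   ...   |
+----------------------------+          +---------+
|            ...             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+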

Becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, 
Thread 2 add ActionStartXCommand enqueue before Thread 1, so 
ForkedActionStartXCommand would be lost, and the thread that fork actions 
parallel submit block at CallableQueueService.blockingWait(). {code}

  was:
Fork actions parallel submit, so will add ForkedActionStartXCommand and 

RecoveryService will check pending action may add ActionStartXCommand, if 
ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
action) in queue, it would be lose. The thread parallel submit actions block at 
CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
finish, but ForkedActionStartXCommand had lost and cause deadlock.
{code:java}
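Thread 1                                Thread 2
(ForkedActionStartXCommand)             (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |   ...   |
+----------------------------+          +---------+
|            ...             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+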

Becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so 
ForkedActionStartXCommand would be lost, and block at 
CallableQueueService.blockingWait(). {code}


> When fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
> Attachments: OOZIE-3717-001.patch
>
>
> Fork actions parallel submit, so will add ForkedActionStartXCommand and 
> RecoveryService will check pending action may add ActionStartXCommand, if 
> ForkedActionStartXCommand enqueue and there is a ActionStartXCommand(the same 
> action) in queue, it would be lose. The thread parallel submit actions block 
> at CallableQueueService.blockingWait() wait for ForkedActionStartXCommand  to 
> finish, but ForkedActionStartXCommand had lost and cause deadlock.
> {code:java}
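> Thread 1                                Thread 2
> (ForkedActionStartXCommand)             (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |   ...   |
> +----------------------------+          +---------+
> |            ...             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+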
> Becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, 
> Thread 2 add ActionStartXCommand enqueue before Thread 1, so 
> ForkedActionStartXCommand would be lost, and the thread that fork actions 
> parallel submit block at CallableQueueService.blockingWait(). {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Updated] (OOZIE-3717) When fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause dead



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Summary: When fork actions parallel submit, becasue 
ForkedActionStartXCommand and ActionStartXCommand has the same name, so 
ForkedActionStartXCommand would be lost, and cause deadlock  (was: Fork actions 
parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has 
the same name, so ForkedActionStartXCommand would be lost, and cause deadlock)

> When fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> --
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
> Attachments: OOZIE-3717-001.patch
>
>
> Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
> enqueued for them. Meanwhile, RecoveryService checks pending actions and may
> enqueue an ActionStartXCommand for the same action. If the
> ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
> same action is already in the queue, the ForkedActionStartXCommand is lost.
> The thread that submits the fork actions in parallel blocks at
> CallableQueueService.blockingWait(), waiting for the
> ForkedActionStartXCommand to finish; because that command was lost, the wait
> never returns and a deadlock results.
> {code:java}
>      Thread 1                        Thread 2
>   (ForkedActionStartXCommand)      (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |    .    |
> +----------------------------+          +---------+
> |             ..             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+
> Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
> the ForkedActionStartXCommand is lost and execution blocks at
> CallableQueueService.blockingWait(). {code}




[jira] [Commented] (OOZIE-3717) Fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause deadloc

2023-07-29 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748883#comment-17748883
 ] 

Hadoop QA commented on OOZIE-3717:
--


Testing JIRA OOZIE-3717

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 1 star import(s)
.{color:red}-1{color} the patch contains 4 line(s) longer than 132 
characters
.{color:green}+1{color} the patch adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} Javadoc generation succeeded with the patch
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warning(s)
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:red}-1{color} There are [3] new bugs found below threshold in total that 
must be fixed.
.{color:green}+1{color} There are no new bugs found in [examples].
.{color:green}+1{color} There are no new bugs found in 
[fluent-job/fluent-job-api].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive].
.{color:green}+1{color} There are no new bugs found in [sharelib/hive2].
.{color:green}+1{color} There are no new bugs found in [sharelib/git].
.{color:green}+1{color} There are no new bugs found in [sharelib/distcp].
.{color:green}+1{color} There are no new bugs found in [sharelib/hcatalog].
.{color:green}+1{color} There are no new bugs found in [sharelib/sqoop].
.{color:green}+1{color} There are no new bugs found in [sharelib/spark].
.{color:green}+1{color} There are no new bugs found in [sharelib/oozie].
.{color:green}+1{color} There are no new bugs found in [sharelib/pig].
.{color:green}+1{color} There are no new bugs found in [sharelib/streaming].
.{color:green}+1{color} There are no new bugs found in [server].
.{color:green}+1{color} There are no new bugs found in [docs].
.{color:green}+1{color} There are no new bugs found in [webapp].
.{color:red}-1{color} There are [3] new bugs found below threshold in 
[core] that must be fixed.
.You can find the SpotBugs diff here (look for the red and orange ones): 
core/findbugs-new.html
.The most important SpotBugs errors are:
.At BulkJPAExecutor.java:[line 206]: This use of 
javax/persistence/EntityManager.createQuery(Ljava/lang/String;)Ljavax/persistence/Query;
 can be vulnerable to SQL/JPQL injection
.At BulkJPAExecutor.java:[line 176]: At BulkJPAExecutor.java:[line 175]
.At BulkJPAExecutor.java:[line 205]: At BulkJPAExecutor.java:[line 199]
.Unsafe comparison of hash that are susceptible to timing attack: At 
BulkJPAExecutor.java:[line 206]
.At ShareLibService.java:[line 689]: At ShareLibService.java:[line 695]
.{color:green}+1{color} There are no new bugs found in [tools].
.{color:green}+1{color} There are no new bugs found in [client].
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 3262
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 
{color:green}+1 MODERNIZER{color}


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

. https://ci-hadoop.apache.org/job/PreCommit-OOZIE-Build/211/



> Fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> -
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
> Attachments: OOZIE-3717-001.patch
>
>
> Fork actions parallel submit, so will add ForkedActi

[jira] [Commented] (OOZIE-3717) Fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause deadloc

2023-07-29 Thread Hadoop QA (Jira)



[ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17748871#comment-17748871
 ] 

Hadoop QA commented on OOZIE-3717:
--

PreCommit-OOZIE-Build started


> Fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> -
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
> Attachments: OOZIE-3717-001.patch
>
>
> Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
> enqueued for them. Meanwhile, RecoveryService checks pending actions and may
> enqueue an ActionStartXCommand for the same action. If the
> ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
> same action is already in the queue, the ForkedActionStartXCommand is lost.
> The thread that submits the fork actions in parallel blocks at
> CallableQueueService.blockingWait(), waiting for the
> ForkedActionStartXCommand to finish; because that command was lost, the wait
> never returns and a deadlock results.
> {code:java}
>      Thread 1                        Thread 2
>   (ForkedActionStartXCommand)      (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |    .    |
> +----------------------------+          +---------+
> |             ..             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+
> Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
> the ForkedActionStartXCommand is lost and execution blocks at
> CallableQueueService.blockingWait(). {code}




[jira] [Updated] (OOZIE-3717) Fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause deadlock



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Attachment: (was: OOZIE-3717-001.patch)

> Fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> -
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
>
> Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
> enqueued for them. Meanwhile, RecoveryService checks pending actions and may
> enqueue an ActionStartXCommand for the same action. If the
> ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
> same action is already in the queue, the ForkedActionStartXCommand is lost.
> The thread that submits the fork actions in parallel blocks at
> CallableQueueService.blockingWait(), waiting for the
> ForkedActionStartXCommand to finish; because that command was lost, the wait
> never returns and a deadlock results.
> {code:java}
>      Thread 1                        Thread 2
>   (ForkedActionStartXCommand)      (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |    .    |
> +----------------------------+          +---------+
> |             ..             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+
> Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
> the ForkedActionStartXCommand is lost and execution blocks at
> CallableQueueService.blockingWait(). {code}




[jira] [Updated] (OOZIE-3717) Fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause deadlock



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Attachment: OOZIE-3717-001.patch

> Fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> -
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
>
> Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
> enqueued for them. Meanwhile, RecoveryService checks pending actions and may
> enqueue an ActionStartXCommand for the same action. If the
> ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
> same action is already in the queue, the ForkedActionStartXCommand is lost.
> The thread that submits the fork actions in parallel blocks at
> CallableQueueService.blockingWait(), waiting for the
> ForkedActionStartXCommand to finish; because that command was lost, the wait
> never returns and a deadlock results.
> {code:java}
>      Thread 1                        Thread 2
>   (ForkedActionStartXCommand)      (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |    .    |
> +----------------------------+          +---------+
> |             ..             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+
> Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
> the ForkedActionStartXCommand is lost and execution blocks at
> CallableQueueService.blockingWait(). {code}




[jira] [Updated] (OOZIE-3717) Fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause deadlock



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Attachment: (was: OOZIE-3717-001.patch)

> Fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> -
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
>
> Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
> enqueued for them. Meanwhile, RecoveryService checks pending actions and may
> enqueue an ActionStartXCommand for the same action. If the
> ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
> same action is already in the queue, the ForkedActionStartXCommand is lost.
> The thread that submits the fork actions in parallel blocks at
> CallableQueueService.blockingWait(), waiting for the
> ForkedActionStartXCommand to finish; because that command was lost, the wait
> never returns and a deadlock results.
> {code:java}
>      Thread 1                        Thread 2
>   (ForkedActionStartXCommand)      (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |    .    |
> +----------------------------+          +---------+
> |             ..             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+
> Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
> the ForkedActionStartXCommand is lost and execution blocks at
> CallableQueueService.blockingWait(). {code}




[jira] [Updated] (OOZIE-3717) Fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause deadlock



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Description: 
Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
enqueued for them. Meanwhile, RecoveryService checks pending actions and may
enqueue an ActionStartXCommand for the same action. If the
ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
same action is already in the queue, the ForkedActionStartXCommand is lost.
The thread that submits the fork actions in parallel blocks at
CallableQueueService.blockingWait(), waiting for the
ForkedActionStartXCommand to finish; because that command was lost, the wait
never returns and a deadlock results.
{code:java}
     Thread 1                        Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |    .    |
+----------------------------+          +---------+
|             ..             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+

Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
the ForkedActionStartXCommand is lost and execution blocks at
CallableQueueService.blockingWait(). {code}

  was:
Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
enqueued for them. Meanwhile, RecoveryService checks pending actions and may
enqueue an ActionStartXCommand for the same action. If the
ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
same action is already in the queue, the ForkedActionStartXCommand is lost.
The thread that submits the fork actions in parallel blocks at
CallableQueueService.blockingWait(), waiting for the
ForkedActionStartXCommand to finish; because that command was lost, the wait
never returns and a deadlock results.
{code:java}
     Thread 1                        Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |    .    |
+----------------------------+          +---------+
|             ..             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+

Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
the ForkedActionStartXCommand is lost and execution blocks at
CallableQueueService.blockingWait(). {code}


> Fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> -
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
>
> Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
> enqueued for them. Meanwhile, RecoveryService checks pending actions and may
> enqueue an ActionStartXCommand for the same action. If the
> ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
> same action is already in the queue, the ForkedActionStartXCommand is lost.
> The thread that submits the fork actions in parallel blocks at
> CallableQueueService.blockingWait(), waiting for the
> ForkedActionStartXCommand to finish; because that command was lost, the wait
> never returns and a deadlock results.
> {code:java}
>      Thread 1                        Thread 2
>   (ForkedActionStartXCommand)      (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |    .    |
> +----------------------------+          +---------+
> |             ..             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+
> Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
> the ForkedActionStartXCommand is lost and execution blocks at
> CallableQueueService.blockingWait(). {code}




[jira] [Updated] (OOZIE-3717) Fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause deadlock



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Description: 
Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
enqueued for them. Meanwhile, RecoveryService checks pending actions and may
enqueue an ActionStartXCommand for the same action. If the
ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
same action is already in the queue, the ForkedActionStartXCommand is lost.
The thread that submits the fork actions in parallel blocks at
CallableQueueService.blockingWait(), waiting for the
ForkedActionStartXCommand to finish; because that command was lost, the wait
never returns and a deadlock results.
{code:java}
     Thread 1                        Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |    .    |
+----------------------------+          +---------+
|             ..             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+

Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
the ForkedActionStartXCommand is lost and execution blocks at
CallableQueueService.blockingWait(). {code}

  was:
Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
enqueued for them. Meanwhile, RecoveryService checks pending actions and may
enqueue an ActionStartXCommand for the same action. If the
ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
same action is already in the queue, the ForkedActionStartXCommand is lost.
The thread that submits the fork actions in parallel blocks at
CallableQueueService.blockingWait(), waiting for the
ForkedActionStartXCommand to finish; because that command was lost, the wait
never returns and a deadlock results.
{code:java}
     Thread 1                        Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |    .    |
+----------------------------+          +---------+
|             ..             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+

Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
the ForkedActionStartXCommand is lost and execution blocks at
CallableQueueService.blockingWait(). {code}


> Fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> -
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
>
> Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
> enqueued for them. Meanwhile, RecoveryService checks pending actions and may
> enqueue an ActionStartXCommand for the same action. If the
> ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
> same action is already in the queue, the ForkedActionStartXCommand is lost.
> The thread that submits the fork actions in parallel blocks at
> CallableQueueService.blockingWait(), waiting for the
> ForkedActionStartXCommand to finish; because that command was lost, the wait
> never returns and a deadlock results.
> {code:java}
>      Thread 1                        Thread 2
>   (ForkedActionStartXCommand)      (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |    .    |
> +----------------------------+          +---------+
> |             ..             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+
> Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
> the ForkedActionStartXCommand is lost and execution blocks at
> CallableQueueService.blockingWait(). {code}




[jira] [Updated] (OOZIE-3717) Fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause deadlock



 [ 
https://issues.apache.org/jira/browse/OOZIE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenhaodan updated OOZIE-3717:
--
Description: 
Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
enqueued for them. Meanwhile, RecoveryService checks pending actions and may
enqueue an ActionStartXCommand for the same action. If the
ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
same action is already in the queue, the ForkedActionStartXCommand is lost.
The thread that submits the fork actions in parallel blocks at
CallableQueueService.blockingWait(), waiting for the
ForkedActionStartXCommand to finish; because that command was lost, the wait
never returns and a deadlock results.
{code:java}
     Thread 1                        Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |    .    |
+----------------------------+          +---------+
|             ..             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+

Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
the ForkedActionStartXCommand is lost and execution blocks at
CallableQueueService.blockingWait(). {code}

  was:
Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
enqueued for them. Meanwhile, RecoveryService checks pending actions and may
enqueue an ActionStartXCommand for the same action. If the
ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
same action is already in the queue, the ForkedActionStartXCommand is lost.
The thread that submits the fork actions in parallel blocks at
CallableQueueService.blockingWait(), waiting for the
ForkedActionStartXCommand to finish; because that command was lost, the wait
never returns and a deadlock results.
{code:java}
     Thread 1                        Thread 2
  (ForkedActionStartXCommand)      (ActionStartXCommand)
+----------------------------+          +---------+
| removeFromUniqueCallables  |          |    .    |
+----------------------------+          +---------+
|             ..             |          |  queue  |
+----------------------------+          +---------+
|           queue            |       enqueue succeeded, in uniqueCallables
+----------------------------+
| wrapper.filterDuplicates() |
+----------------------------+

Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
the ForkedActionStartXCommand is lost, and CallableQueueService blocks at
CallableQueueService.blockingWait(). {code}


> Fork actions parallel submit, becasue ForkedActionStartXCommand and 
> ActionStartXCommand has the same name, so ForkedActionStartXCommand would be 
> lost, and cause deadlock
> -
>
> Key: OOZIE-3717
> URL: https://issues.apache.org/jira/browse/OOZIE-3717
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: 5.2.1
>Reporter: chenhaodan
>Assignee: chenhaodan
>Priority: Major
> Fix For: trunk
>
>
> Fork actions are submitted in parallel, so a ForkedActionStartXCommand is
> enqueued for them. Meanwhile, RecoveryService checks pending actions and may
> enqueue an ActionStartXCommand for the same action. If the
> ForkedActionStartXCommand is enqueued while an ActionStartXCommand for the
> same action is already in the queue, the ForkedActionStartXCommand is lost.
> The thread that submits the fork actions in parallel blocks at
> CallableQueueService.blockingWait(), waiting for the
> ForkedActionStartXCommand to finish; because that command was lost, the wait
> never returns and a deadlock results.
> {code:java}
>      Thread 1                        Thread 2
>   (ForkedActionStartXCommand)      (ActionStartXCommand)
> +----------------------------+          +---------+
> | removeFromUniqueCallables  |          |    .    |
> +----------------------------+          +---------+
> |             ..             |          |  queue  |
> +----------------------------+          +---------+
> |           queue            |       enqueue succeeded, in uniqueCallables
> +----------------------------+
> | wrapper.filterDuplicates() |
> +----------------------------+
> Because ForkedActionStartXCommand and ActionStartXCommand have the same name,
> the ForkedActionStartXCommand is lost and execution blocks at
> CallableQueueService.blockingWait(). {code}




[jira] [Created] (OOZIE-3717) Fork actions parallel submit, becasue ForkedActionStartXCommand and ActionStartXCommand has the same name, so ForkedActionStartXCommand would be lost, and cause deadlock