[ 
https://issues.apache.org/jira/browse/OOZIE-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Salamon updated OOZIE-3382:
----------------------------------
    Description: 
OOZIE-3354 improved {{SshActionExecutor}} to avoid {{Process#waitFor()}} blocks 
and modified the {{drainBuffers}} method to keep draining the standard output 
(and standard error) continuously.

Right now the speed of the drain is hardwired. As long as the process is 
running the method only reads 1024 bytes in each cycle (half a second) which 
can take very long time if we want to drain several megabytes (for instance 
{{oozie.servlet.CallbackServlet.max.data.len}} is increased).

Let's optimize the draining.

We can either read 1024 bytes multiple times in each cycle (as long as there 
are data in the buffer), or we can increase the value of the buffer size 
(1024). 

In the latter case the default of the buffer size could be half of the 
{{oozie.servlet.CallbackServlet.max.data.len}} value, but we also need an 
additional property to specify the buffer size (to avoid memory problems 
because of using a very big buffer). We can keep 1024 as a minimum buffer size. 

It would be also useful to refactor the code and put the buffer draining into a 
separate class and create unit tests for the class. Using this class in 
{{ShellMain}} to avoid code duplication would also be very useful, but we have 
to fix OOZIE-3359 first.

  was:
OOZIE-3354 improved {{SshActionExecutor}} to avoid {{Process#waitFor()}} blocks 
and modified the {{drainBuffers}} method to keep draining the standard output 
(and standard error) continuously.

Right now the speed of the drain is hardwired. As long as the process is 
running the method only reads 1024 bytes in each cycle (half a second) which 
can take very long time if we want to drain several megabytes (for instance 
{{oozie.servlet.CallbackServlet.max.data.len}} is increased).

Let's optimize the draining.

We can either read 1024 bytes multiple times in each cycle (as long as there 
are data in the buffer), or we can increase the value of the buffer size 
(1024). 

In the latter case the default of the buffer size could be half of the 
{{oozie.servlet.CallbackServlet.max.data.len}} value, but we also need an 
additional property to specify the buffer size (to avoid memory problems 
because of using a very big buffer). We can keep 1024 as a minimum buffer size. 


> Optimize SshActionExecutor's drainBuffers method
> ------------------------------------------------
>
>                 Key: OOZIE-3382
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3382
>             Project: Oozie
>          Issue Type: Improvement
>            Reporter: Andras Salamon
>            Assignee: Andras Salamon
>            Priority: Major
>
> OOZIE-3354 improved {{SshActionExecutor}} to avoid {{Process#waitFor()}} 
> blocks and modified the {{drainBuffers}} method to keep draining the standard 
> output (and standard error) continuously.
> Right now the speed of the drain is hardwired. As long as the process is 
> running the method only reads 1024 bytes in each cycle (half a second) which 
> can take very long time if we want to drain several megabytes (for instance 
> {{oozie.servlet.CallbackServlet.max.data.len}} is increased).
> Let's optimize the draining.
> We can either read 1024 bytes multiple times in each cycle (as long as there 
> are data in the buffer), or we can increase the value of the buffer size 
> (1024). 
> In the latter case the default of the buffer size could be half of the 
> {{oozie.servlet.CallbackServlet.max.data.len}} value, but we also need an 
> additional property to specify the buffer size (to avoid memory problems 
> because of using a very big buffer). We can keep 1024 as a minimum buffer 
> size. 
> It would be also useful to refactor the code and put the buffer draining into 
> a separate class and create unit tests for the class. Using this class in 
> {{ShellMain}} to avoid code duplication would also be very useful, but we 
> have to fix OOZIE-3359 first.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to