[
https://issues.apache.org/jira/browse/OOZIE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574681#comment-13574681
]
Rohini Palaniswamy commented on OOZIE-1200:
-------------------------------------------
1* CoordActionInputCheckXCommand, execute() calls updateCoordAction in finally
BLOCK, if there is an exception halfway the execution this will save an
inconsistent state, this update should be done at the end of the try BLOCK.
Don't want to change this behaviour now without taking time to investigate
why it was done that way. The pathExists() calls sets the error code and error
message on the action, but the CoordActionUpdateForInputCheckJPAExecutor does
not update the action with those. And in many cases the error message is not
updated and action stays in WAITING state and users have to see the logs to
determine what is the problem. Will create a separate jira to track this.
2* CoordActionInputCheckXCommand, pathExists() can extract the user from the
conf within itself instead doing that outside and passing it as parameter.
Moved the user out as the pathExists() call is done in a loop and user does
not change.
11* CoordinatorFunctionalSpec, section #6.8.3, shouldn't map-reduce be
mentioned as a type too?
No. It is not possible to have a MRAction because we do not have support for
calling the setter apis on HCatInputFormat and HCatOutputFormat. We will have
to design a HCatMRAction in future to take care of that.
22* HCatURIHandler, the exits() signature that takes a context should not close
the context, the caller is responsible for that, so it can reuse the context
It does not close the context.
21* HCatURIHandler/FSURIHandler/URIHandler, why 2 signatures of the exists()?
the one with the context should be enough
32* URIHandler, why registerForNotification() does not take a URIContext? (#)
URIContext API is for cases where we know all the dependencies have same
scheme and authority and can reuse the context. In other cases it does not make
each code make another call to get URIContext and then call exists() method.
In case of latest/future range, all the dependencies will be pointing to same
hcat server, db and table or same hdfs location. But in the case of missing
dependencies which contains dependencies from different dataset input event,
each dependency could point to different servers(Especially with the Y!
complicated case of 24 clusters) and there could be a mix of hdfs, hcat, hftp
and webhdfs schemes. If needed we can optimize later, grouping all dependencies
with same authority and get the uricontext for them. But since HCatClient and
FileSystem both have caching internally, not worried much about performance
impact now.
34* WaitingAction, why is serializable? (#)
For the other cache implementation which can spill to disk
Have addressed the other code comments. Working on the documentation. Will post
a patch by EOD.
Will work with Virag on getting the missing commits synched.
> Review of HCAT integration branch
> ---------------------------------
>
> Key: OOZIE-1200
> URL: https://issues.apache.org/jira/browse/OOZIE-1200
> Project: Oozie
> Issue Type: Sub-task
> Components: coordinator, docs
> Reporter: Alejandro Abdelnur
> Fix For: trunk
>
> Attachments: HCat-commandcoord-package-review.patch,
> HCat-depedency-cache-review.patch, HCat-dependency-package-review.patch,
> Hcat-service-util-tools-action-jms-packages-review-1.patch,
> Hcat-service-util-tools-action-jms-packages-review.patch,
> HCatSpec-comments.patch
>
>
> Given the number of commits (and fixes) that when into this branch, I think
> the best way to review it is by posting patches with comments through out the
> code. Then we can follow up the discussion as comments in this JIRA.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira