[ 
https://issues.apache.org/jira/browse/NIFI-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Medel resolved NIFI-7411.
-------------------------------
    Resolution: Resolved

NiFi + H2O Driverless AI MOJO Processor will not be merged into the project. 
Instead it will be managed, maintained by H2O repo: 
[https://github.com/h2oai/dai-deployment-examples/pull/18]

> Integrates NiFi with H2O Driverless AI MOJO Scoring Pipeline (Java Runtime) 
> To Do ML Inference
> ----------------------------------------------------------------------------------------------
>
>                 Key: NIFI-7411
>                 URL: https://issues.apache.org/jira/browse/NIFI-7411
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>    Affects Versions: 1.12.0
>         Environment: Mac OS X Mojave 10.14.6
>            Reporter: James Medel
>            Priority: Major
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> *NiFi and H2O Driverless AI Integration* via Custom NiFi Processor:
> Integrates NiFi with H2O Driverless AI by using Driverless AI's MOJO Scoring 
> Pipeline (in Java Runtime) and NiFi's Custom Processor. This processor 
> executes the MOJO Scoring Pipeline to do batch scoring or real-time scoring 
> for one or more predicted labels on tabular data in the incoming flow file 
> content. If the tabular data is one row, then the MOJO does real-time 
> scoring. If the tabular data is multiple rows, then the MOJO does batch 
> scoring. I would like to contribute my processor to NiFi as a new feature.
> *1 Custom Processor* created for NiFi:
> *ExecuteMojoScoringRecord* - Executes H2O Driverless AI's MOJO Scoring 
> Pipeline in Java Runtime to do batch scoring or real-time scoring on a frame 
> of data within each incoming flow file. It requires the user to add 
> *mojo2-runtime.jar* filepath into *MOJO2 Runtime JAR Directory* ** property 
> to dynamically modify the classpath. It also requires the user to add the 
> *pipeline.mojo* filepath into the *Pipeline MOJO Filepath* property. This 
> property is used in the onTrigger() method to get the pipeline.mojo filepath, 
> so we can pass it into the
> MojoPipeline.loadFrom(pipelineMojoPath) to instantiate our MojoPipeline 
> model. Then the record read in with Record Reader and the model are passed 
> into a predict() method to make predictions on the test data within the 
> record. Inside the predict() method, I use MojoFrameBuilder and 
> MojoRowBuilder with the recordMap to build an input MojoFrame. Then I use the 
> model's transform(input MojoFrame) method to make the predictions on the 
> input and store them into an output MojoFrame. I iterate through the 
> MojoFrame by row and column to store each key value pair prediction into the 
> predictedRecordMap. I then convert the predictedRecordMap to predictedRecord 
> and return the record back to onTrigger to write the record to the flow file 
> content using RecordSetWriter. We keep writing predicted Records to the flow 
> file content until there are no more records to write. Then we reach near end 
> of onTrigger() and the flow file is either transferred on relationship 
> failure, success or original to the next connection.
>  
> *Hydraulic System Condition Monitoring* Data used in NiFi Flow:
>  
> The sensor test data I used in this integration comes from UCI ML Repo: 
> Condition Monitoring for Hydraulic Systems. I was able to predict the 
> hydraulic cooling condition through NiFi and H2O Integration described above. 
> This use case is hydraulic system predictive maintenance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to