[ https://issues.apache.org/jira/browse/NIFI-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Medel resolved NIFI-7411.
-------------------------------
    Resolution: Resolved

The NiFi + H2O Driverless AI MOJO Processor will not be merged into the
project. Instead, it will be managed and maintained in the H2O repo:
[https://github.com/h2oai/dai-deployment-examples/pull/18]

> Integrates NiFi with H2O Driverless AI MOJO Scoring Pipeline (Java Runtime) To Do ML Inference
> ----------------------------------------------------------------------------------------------
>
>                 Key: NIFI-7411
>                 URL: https://issues.apache.org/jira/browse/NIFI-7411
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>    Affects Versions: 1.12.0
>         Environment: Mac OS X Mojave 10.14.6
>            Reporter: James Medel
>            Priority: Major
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> *NiFi and H2O Driverless AI Integration* via Custom NiFi Processor:
> Integrates NiFi with H2O Driverless AI by using Driverless AI's MOJO Scoring
> Pipeline (in Java Runtime) and NiFi's Custom Processor. This processor
> executes the MOJO Scoring Pipeline to do batch scoring or real-time scoring
> for one or more predicted labels on tabular data in the incoming flow file
> content. If the tabular data is one row, then the MOJO does real-time
> scoring. If the tabular data is multiple rows, then the MOJO does batch
> scoring. I would like to contribute my processor to NiFi as a new feature.
>
> *1 Custom Processor* created for NiFi:
> *ExecuteMojoScoringRecord* - Executes H2O Driverless AI's MOJO Scoring
> Pipeline in Java Runtime to do batch scoring or real-time scoring on a frame
> of data within each incoming flow file. It requires the user to add the
> *mojo2-runtime.jar* filepath into the *MOJO2 Runtime JAR Directory* property
> to dynamically modify the classpath. It also requires the user to add the
> *pipeline.mojo* filepath into the *Pipeline MOJO Filepath* property.
> This property is used in the onTrigger() method to get the pipeline.mojo
> filepath, so we can pass it into MojoPipeline.loadFrom(pipelineMojoPath) to
> instantiate our MojoPipeline model. Then the record read in by the Record
> Reader and the model are passed into a predict() method to make predictions
> on the test data within the record. Inside the predict() method, I use
> MojoFrameBuilder and MojoRowBuilder with the recordMap to build an input
> MojoFrame. Then I use the model's transform(input MojoFrame) method to make
> the predictions on the input and store them in an output MojoFrame. I
> iterate through the output MojoFrame by row and column to store each
> key-value pair prediction in the predictedRecordMap. I then convert the
> predictedRecordMap to a predictedRecord and return the record to
> onTrigger(), which writes it to the flow file content using RecordSetWriter.
> We keep writing predicted records to the flow file content until there are
> no more records to write. Then, near the end of onTrigger(), the flow file
> is transferred to the next connection on the failure, success, or original
> relationship.
>
> *Hydraulic System Condition Monitoring* Data used in NiFi Flow:
>
> The sensor test data I used in this integration comes from UCI ML Repo:
> Condition Monitoring for Hydraulic Systems. I was able to predict the
> hydraulic cooling condition through the NiFi and H2O integration described
> above. This use case is hydraulic system predictive maintenance.
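
Editor's illustration of the predict() flow described in the issue (build an
input frame from record maps, transform it, then read the predictions back by
row and column). This is a minimal sketch, not the processor's code: Frame,
Scorer, toyScorer() and the column names are hypothetical stand-ins so that
the example runs without mojo2-runtime.jar. In the processor, as the issue
describes it, those roles are played by MojoFrameBuilder/MojoRowBuilder,
MojoPipeline and MojoFrame.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ScoringSketch {

    /** Hypothetical stand-in for a tabular frame: column names plus rows of values. */
    public static final class Frame {
        public final List<String> columns;
        public final List<List<String>> rows = new ArrayList<>();

        public Frame(List<String> columns) {
            this.columns = columns;
        }
    }

    /** Hypothetical stand-in for the loaded model: maps an input frame to an output frame. */
    public interface Scorer {
        Frame transform(Frame input);
    }

    /** Scores one record (real-time) or many records (batch) in a single pass. */
    public static List<Map<String, String>> predict(List<Map<String, String>> records, Scorer model) {
        List<Map<String, String>> predictedRecords = new ArrayList<>();
        if (records.isEmpty()) {
            return predictedRecords;
        }

        // Build the input frame: one row per record, columns taken from the first record.
        Frame input = new Frame(new ArrayList<>(records.get(0).keySet()));
        for (Map<String, String> record : records) {
            List<String> row = new ArrayList<>();
            for (String column : input.columns) {
                row.add(record.get(column));
            }
            input.rows.add(row);
        }

        // Run the model, then walk the output by row and column into key-value maps.
        Frame output = model.transform(input);
        for (List<String> row : output.rows) {
            Map<String, String> predictedRecord = new LinkedHashMap<>();
            for (int c = 0; c < output.columns.size(); c++) {
                predictedRecord.put(output.columns.get(c), row.get(c));
            }
            predictedRecords.add(predictedRecord);
        }
        return predictedRecords;
    }

    /** Made-up rule standing in for a real model: labels the first column's value. */
    public static Scorer toyScorer() {
        return input -> {
            Frame output = new Frame(Collections.singletonList("prediction"));
            for (List<String> row : input.rows) {
                double value = Double.parseDouble(row.get(0));
                output.rows.add(Collections.singletonList(value > 1.0 ? "high" : "low"));
            }
            return output;
        };
    }

    public static void main(String[] args) {
        // Two made-up records, so this call exercises the batch case.
        List<Map<String, String>> records = new ArrayList<>();
        for (String reading : new String[] {"2.5", "0.3"}) {
            Map<String, String> record = new LinkedHashMap<>();
            record.put("sensor_a", reading);
            records.add(record);
        }
        for (Map<String, String> predicted : predict(records, toyScorer())) {
            System.out.println(predicted);
        }
    }
}
```

Passing a single-element list to predict() corresponds to the real-time case
and a longer list to the batch case; the code path is the same either way.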