[ 
https://issues.apache.org/jira/browse/NIFI-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Payne reassigned NIFI-1135:
--------------------------------

    Assignee: Mark Payne

> For Provenance Query, bring back Event Summaries instead of the Events 
> themselves
> ---------------------------------------------------------------------------------
>
>                 Key: NIFI-1135
>                 URL: https://issues.apache.org/jira/browse/NIFI-1135
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Core UI
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>
> Currently, when we query Provenance, we pull back up to 1000 events. These 
> are full Provenance Events with attributes, etc. If the query takes a long 
> time, we end up requesting the objects that have already matched the query 
> many times over. This uses a great deal of heap and sends back very large 
> JSON objects (10+ MB is not uncommon, and it could potentially be far 
> worse).
> We should instead use a ProvenanceEventSummary object. This object should 
> contain just the info shown in the results table and the pointer to the 
> actual event in the Provenance Store. This allows us to return query 
> results much faster, store less data in the heap, and send less data back 
> to the end user with virtually the same experience.
> The one place where the UX would differ is when the user clicks the "info" 
> button to view the entire provenance event: we would have to pull the event 
> back from the server, rather than already having it in memory.
> We should consider storing all of the fields in the results table in Lucene 
> to provide faster results. Otherwise, we could still get potentially better 
> results with the current approach if we just ensure that the first fields 
> that we store are those in the results table. This allows us to read just a 
> small portion of the event from file and deserialize just a small amount of 
> data before moving on to the next event.
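
The summary-plus-pointer idea described in the issue can be sketched roughly as below. This is only an illustration, not NiFi's actual API: FullEvent, EventSummary, SketchStore, and the chosen fields are all hypothetical names, the results-table columns are assumed, and an in-memory map stands in for the Provenance Store. It shows the shape of the two paths (summaries for the query, full event on demand for the "info" button), not the heap or I/O savings themselves.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a full provenance event; not NiFi's actual class.
final class FullEvent {
    final long eventId;
    final String eventType;
    final String componentId;
    final long eventTimeMillis;
    final Map<String, String> attributes; // the bulky part we want to avoid shipping

    FullEvent(long eventId, String eventType, String componentId,
              long eventTimeMillis, Map<String, String> attributes) {
        this.eventId = eventId;
        this.eventType = eventType;
        this.componentId = componentId;
        this.eventTimeMillis = eventTimeMillis;
        this.attributes = new HashMap<>(attributes);
    }
}

// Hypothetical summary: assumed results-table columns plus the pointer (eventId).
final class EventSummary {
    final long eventId; // pointer back to the full event in the store
    final String eventType;
    final String componentId;
    final long eventTimeMillis;

    EventSummary(FullEvent e) {
        this.eventId = e.eventId;
        this.eventType = e.eventType;
        this.componentId = e.componentId;
        this.eventTimeMillis = e.eventTimeMillis;
    }
}

// Hypothetical in-memory stand-in for the Provenance Store.
final class SketchStore {
    private final Map<Long, FullEvent> events = new LinkedHashMap<>();

    void add(FullEvent e) {
        events.put(e.eventId, e);
    }

    // Query path: return lightweight summaries, capped at maxResults.
    List<EventSummary> querySummaries(String eventType, int maxResults) {
        List<EventSummary> results = new ArrayList<>();
        for (FullEvent e : events.values()) {
            if (results.size() >= maxResults) {
                break;
            }
            if (e.eventType.equals(eventType)) {
                results.add(new EventSummary(e));
            }
        }
        return results;
    }

    // "Info" button path: fetch the full event on demand; null if absent.
    FullEvent fetchFull(long eventId) {
        return events.get(eventId);
    }
}
```

With this split, the repeated polling of a long-running query would carry only the summaries, and the attributes would travel once, when a specific event is opened.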



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
