[jira] [Commented] (MINIFICPP-39) Create FocusArchive processor
[ https://issues.apache.org/jira/browse/MINIFICPP-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204354#comment-16204354 ] marco polo commented on MINIFICPP-39: - [~calebj] An alternative is to use log attributes to do validation through the log output. It is not a great way to validate, but we'll get to the current_flowfile_ eventually, when we have a chance, and we would always welcome contributions there. > Create FocusArchive processor > - > > Key: MINIFICPP-39 > URL: https://issues.apache.org/jira/browse/MINIFICPP-39 > Project: NiFi MiNiFi C++ > Issue Type: Task >Reporter: Andrew Christianson >Assignee: Andrew Christianson >Priority: Minor > > Create a FocusArchive processor which implements a lens over an archive > (tar, etc.). A concise, though informal, definition of a lens is as follows: > "Essentially, they represent the act of “peering into” or “focusing in on” > some particular piece/path of a complex data object such that you can more > precisely target particular operations without losing the context or > structure of the overall data you’re working with." > https://medium.com/@dtipson/functional-lenses-d1aba9e52254#.hdgsvbraq > Why a FocusArchive in MiNiFi? Simply put, it will enable us to "focus in on" > an entry in the archive, perform processing *in-context* of that entry, then > re-focus on the overall archive. This allows for transformation or other > processing of an entry in the archive without losing the overall context of > the archive. > Initial format support is tar, due to its simplicity and ubiquity. > Attributes: > - Path (the path in the archive to focus; "/" to re-focus the overall archive) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (NIFIREG-31) Update Admin/Workflow tab
[ https://issues.apache.org/jira/browse/NIFIREG-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204176#comment-16204176 ] ASF GitHub Bot commented on NIFIREG-31: --- GitHub user scottyaslan opened a pull request: https://github.com/apache/nifi-registry/pull/21 [NIFIREG-31] update buckets data table to include search/filter capab… …ilities and add bucket creation dialog You can merge this pull request into a Git repository by running: $ git pull https://github.com/scottyaslan/nifi-registry NIFIREG-31 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi-registry/pull/21.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21 commit 166a6364fa4953239f6d10f366bc68a9ff7ac334 Author: Scott Aslan Date: 2017-10-13T20:33:59Z [NIFIREG-31] update buckets data table to include search/filter capabilities and add bucket creation dialog > Update Admin/Workflow tab > - > > Key: NIFIREG-31 > URL: https://issues.apache.org/jira/browse/NIFIREG-31 > Project: NiFi Registry > Issue Type: Sub-task >Reporter: Scott Aslan > > Rename Workflow tab to buckets. > Increase width of buckets container to match the users. > Add user search type functionality to the buckets data table. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] nifi-registry pull request #21: [NIFIREG-31] update buckets data table to in...
GitHub user scottyaslan opened a pull request: https://github.com/apache/nifi-registry/pull/21 [NIFIREG-31] update buckets data table to include search/filter capab… …ilities and add bucket creation dialog You can merge this pull request into a Git repository by running: $ git pull https://github.com/scottyaslan/nifi-registry NIFIREG-31 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi-registry/pull/21.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21 commit 166a6364fa4953239f6d10f366bc68a9ff7ac334 Author: Scott Aslan Date: 2017-10-13T20:33:59Z [NIFIREG-31] update buckets data table to include search/filter capabilities and add bucket creation dialog ---
[jira] [Commented] (NIFI-4471) Set flow limits at process group level
[ https://issues.apache.org/jira/browse/NIFI-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204100#comment-16204100 ] Kevin Risden commented on NIFI-4471: Currently in NiFi there is nothing stopping a user from setting 10 million as the connection flowfile queue object count. The UI/API allows basically any number to be put in there. This is an easy way for a user to cause instability in the cluster just by doing what the UI allows. > Set flow limits at process group level > -- > > Key: NIFI-4471 > URL: https://issues.apache.org/jira/browse/NIFI-4471 > Project: Apache NiFi > Issue Type: New Feature > Components: Core Framework >Reporter: Haimo Liu > > In a multi-tenancy type of operational environment, as a NIFI admin user, I > want to be able to set some limits at the Process Group level, to prevent my > NIFI server from being stressed out. > 1. I want to say "no connection's limit may be set higher than xxx MB." > 2. "I can queue no more than xxx FFs at any connection" -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (NIFI-4472) add alerts when a node is disconnected in my NIFI cluster
[ https://issues.apache.org/jira/browse/NIFI-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204097#comment-16204097 ] Kevin Risden commented on NIFI-4472: Currently we use Ambari to manage HDF, and the only Ambari alert is for the NiFi process being down (pid not alive). We added a custom alert to Ambari to check when an HDF node disconnects. > add alerts when a node is disconnected in my NIFI cluster > - > > Key: NIFI-4472 > URL: https://issues.apache.org/jira/browse/NIFI-4472 > Project: Apache NiFi > Issue Type: New Feature > Components: Core Framework >Reporter: Haimo Liu > > When a NIFI node is disconnected from my cluster, it would be nice if I could > get timely alerts/notifications. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (NIFI-4481) Identify which processors have been configured for "primary node" only
[ https://issues.apache.org/jira/browse/NIFI-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Gilman updated NIFI-4481: -- Status: Patch Available (was: In Progress) > Identify which processors have been configured for "primary node" only > -- > > Key: NIFI-4481 > URL: https://issues.apache.org/jira/browse/NIFI-4481 > Project: Apache NiFi > Issue Type: Improvement > Components: Core UI >Reporter: Rob Moran >Assignee: Matt Gilman >Priority: Minor > Attachments: primary-processor-id.png > > > Possibly identify on canvas components and in the Summary table -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] nifi pull request #2210: NIFI-4481: Visualize Processors Running on Primary ...
GitHub user mcgilman opened a pull request: https://github.com/apache/nifi/pull/2210 NIFI-4481: Visualize Processors Running on Primary Node NIFI-4481: - Adding support for visualizing if a component is scheduled for primary node only. [Visualization](https://issues.apache.org/jira/secure/attachment/12891747/primary-processor-id.png) should only be available in cluster mode where the Execution Node configuration is available. In cluster mode, the nodes on the canvas should show the (P) icon when configured for Execution Node: Primary. Like Run Status, this indication will be available regardless of permission for the component. Additionally, this indication is available in the Summary Table. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mcgilman/nifi NIFI-4481 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi/pull/2210.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2210 commit 3f0d9e05c7c1aea0bd03a999c16893fb308436ca Author: Matt Gilman Date: 2017-10-12T17:58:50Z NIFI-4481: - Adding support for visualizing if a component is scheduled for primary node only. ---
[jira] [Commented] (NIFI-4481) Identify which processors have been configured for "primary node" only
[ https://issues.apache.org/jira/browse/NIFI-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204008#comment-16204008 ] ASF GitHub Bot commented on NIFI-4481: -- GitHub user mcgilman opened a pull request: https://github.com/apache/nifi/pull/2210 NIFI-4481: Visualize Processors Running on Primary Node NIFI-4481: - Adding support for visualizing if a component is scheduled for primary node only. [Visualization](https://issues.apache.org/jira/secure/attachment/12891747/primary-processor-id.png) should only be available in cluster mode where the Execution Node configuration is available. In cluster mode, the nodes on the canvas should show the (P) icon when configured for Execution Node: Primary. Like Run Status, this indication will be available regardless of permission for the component. Additionally, this indication is available in the Summary Table. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mcgilman/nifi NIFI-4481 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi/pull/2210.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2210 commit 3f0d9e05c7c1aea0bd03a999c16893fb308436ca Author: Matt Gilman Date: 2017-10-12T17:58:50Z NIFI-4481: - Adding support for visualizing if a component is scheduled for primary node only. > Identify which processors have been configured for "primary node" only > -- > > Key: NIFI-4481 > URL: https://issues.apache.org/jira/browse/NIFI-4481 > Project: Apache NiFi > Issue Type: Improvement > Components: Core UI >Reporter: Rob Moran >Assignee: Matt Gilman >Priority: Minor > Attachments: primary-processor-id.png > > > Possibly identify on canvas components and in the Summary table -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (NIFI-4484) Update doc screenshots in User Guide section "Adding Controller Services for Reporting Tasks"
[ https://issues.apache.org/jira/browse/NIFI-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Gilman resolved NIFI-4484. --- Resolution: Fixed Fix Version/s: 1.5.0 > Update doc screenshots in User Guide section "Adding Controller Services for > Reporting Tasks" > - > > Key: NIFI-4484 > URL: https://issues.apache.org/jira/browse/NIFI-4484 > Project: Apache NiFi > Issue Type: Improvement > Components: Documentation & Website >Affects Versions: 1.4.0 >Reporter: Andrew Lim >Assignee: Andrew Lim >Priority: Minor > Fix For: 1.5.0 > > > Per NIFI-3941, the tab name was changed from "Controller Services" to > "Reporting Task Controller Services". > The relevant screenshots need to be updated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (NIFI-4444) Upgrade Jersey Versions
[ https://issues.apache.org/jira/browse/NIFI-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204000#comment-16204000 ] ASF GitHub Bot commented on NIFI-: -- Github user asfgit closed the pull request at: https://github.com/apache/nifi/pull/2206 > Upgrade Jersey Versions > --- > > Key: NIFI- > URL: https://issues.apache.org/jira/browse/NIFI- > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Affects Versions: 1.4.0 >Reporter: Matt Gilman >Assignee: Matt Gilman > Fix For: 1.5.0 > > Attachments: NIFI-.xml > > > Need to upgrade to a newer version of Jersey. The primary motivation is to > upgrade the version used within NiFi itself. However, there are a number of > extensions that also leverage it. Of those extensions, some utilize the older > version defined in dependencyManagement while others override explicitly > within their own bundle dependencyManagement. For this JIRA I propose > removing the Jersey artifacts from the root pom and allow the version to be > specified on a bundle by bundle basis. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (NIFI-4484) Update doc screenshots in User Guide section "Adding Controller Services for Reporting Tasks"
[ https://issues.apache.org/jira/browse/NIFI-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203999#comment-16203999 ] ASF GitHub Bot commented on NIFI-4484: -- Github user asfgit closed the pull request at: https://github.com/apache/nifi/pull/2209 > Update doc screenshots in User Guide section "Adding Controller Services for > Reporting Tasks" > - > > Key: NIFI-4484 > URL: https://issues.apache.org/jira/browse/NIFI-4484 > Project: Apache NiFi > Issue Type: Improvement > Components: Documentation & Website >Affects Versions: 1.4.0 >Reporter: Andrew Lim >Assignee: Andrew Lim >Priority: Minor > > Per NIFI-3941, the tab name was changed from "Controller Services" to > "Reporting Task Controller Services". > The relevant screenshots need to be updated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (NIFI-4484) Update doc screenshots in User Guide section "Adding Controller Services for Reporting Tasks"
[ https://issues.apache.org/jira/browse/NIFI-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203997#comment-16203997 ] ASF GitHub Bot commented on NIFI-4484: -- Github user mcgilman commented on the issue: https://github.com/apache/nifi/pull/2209 Thanks @andrewmlim! This has been merged to master. > Update doc screenshots in User Guide section "Adding Controller Services for > Reporting Tasks" > - > > Key: NIFI-4484 > URL: https://issues.apache.org/jira/browse/NIFI-4484 > Project: Apache NiFi > Issue Type: Improvement > Components: Documentation & Website >Affects Versions: 1.4.0 >Reporter: Andrew Lim >Assignee: Andrew Lim >Priority: Minor > > Per NIFI-3941, the tab name was changed from "Controller Services" to > "Reporting Task Controller Services". > The relevant screenshots need to be updated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] nifi pull request #2209: NIFI-4484 Update screenshots in User Guide for Repo...
Github user asfgit closed the pull request at: https://github.com/apache/nifi/pull/2209 ---
[GitHub] nifi pull request #2206: NIFI-4444: Upgrade to Jersey 2.x
Github user asfgit closed the pull request at: https://github.com/apache/nifi/pull/2206 ---
[GitHub] nifi issue #2209: NIFI-4484 Update screenshots in User Guide for Reporting T...
Github user mcgilman commented on the issue: https://github.com/apache/nifi/pull/2209 Thanks @andrewmlim! This has been merged to master. ---
[jira] [Commented] (NIFI-4484) Update doc screenshots in User Guide section "Adding Controller Services for Reporting Tasks"
[ https://issues.apache.org/jira/browse/NIFI-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203996#comment-16203996 ] ASF subversion and git services commented on NIFI-4484: --- Commit 0e3d83c3b848bed5e32c93371375aa4514137986 in nifi's branch refs/heads/master from [~andrewmlim] [ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=0e3d83c ] NIFI-4484 Update screenshots in User Guide for Reporting Task Controller Services tab. This closes #2209 > Update doc screenshots in User Guide section "Adding Controller Services for > Reporting Tasks" > - > > Key: NIFI-4484 > URL: https://issues.apache.org/jira/browse/NIFI-4484 > Project: Apache NiFi > Issue Type: Improvement > Components: Documentation & Website >Affects Versions: 1.4.0 >Reporter: Andrew Lim >Assignee: Andrew Lim >Priority: Minor > > Per NIFI-3941, the tab name was changed from "Controller Services" to > "Reporting Task Controller Services". > The relevant screenshots need to be updated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (NIFI-4484) Update doc screenshots in User Guide section "Adding Controller Services for Reporting Tasks"
[ https://issues.apache.org/jira/browse/NIFI-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203990#comment-16203990 ] ASF GitHub Bot commented on NIFI-4484: -- Github user mcgilman commented on the issue: https://github.com/apache/nifi/pull/2209 Will review... > Update doc screenshots in User Guide section "Adding Controller Services for > Reporting Tasks" > - > > Key: NIFI-4484 > URL: https://issues.apache.org/jira/browse/NIFI-4484 > Project: Apache NiFi > Issue Type: Improvement > Components: Documentation & Website >Affects Versions: 1.4.0 >Reporter: Andrew Lim >Assignee: Andrew Lim >Priority: Minor > > Per NIFI-3941, the tab name was changed from "Controller Services" to > "Reporting Task Controller Services". > The relevant screenshots need to be updated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] nifi issue #2209: NIFI-4484 Update screenshots in User Guide for Reporting T...
Github user mcgilman commented on the issue: https://github.com/apache/nifi/pull/2209 Will review... ---
[jira] [Commented] (MINIFICPP-72) Add tar and compression support for MergeContent
[ https://issues.apache.org/jira/browse/MINIFICPP-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203763#comment-16203763 ] bqiu commented on MINIFICPP-72: --- Aldrin, currently NiFi supports the merge formats tar and zip. I already committed merge content for MiNiFi. This JIRA is to add the tar and zip merge formats to MiNiFi so that the merge content processor for MiNiFi has feature parity with NiFi. I will add a compress content processor for MiNiFi in a different JIRA. > Add tar and compression support for MergeContent > > > Key: MINIFICPP-72 > URL: https://issues.apache.org/jira/browse/MINIFICPP-72 > Project: NiFi MiNiFi C++ > Issue Type: New Feature >Affects Versions: 1.0.0 >Reporter: bqiu > Fix For: 1.0.0 > > > Add tar and compression support for MergeContent > will use the https://www.libarchive.org -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MINIFICPP-39) Create FocusArchive processor
[ https://issues.apache.org/jira/browse/MINIFICPP-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203718#comment-16203718 ] Andrew Christianson commented on MINIFICPP-39: -- [~calebj], it looks like we currently don't have that facility in the cpp unit test framework. So for now, the main options are to use PutFile and verify the output after it's written to disk, or to use the docker system integration test framework. The docs on the docker system integration test framework are at https://github.com/apache/nifi-minifi-cpp/blob/master/docker/test/integration/README.md. To do this, you'd have to add support for these processors (simple python classes), then create a custom OutputValidator (look at the SingleFileOutputValidator for an example). > Create FocusArchive processor > - > > Key: MINIFICPP-39 > URL: https://issues.apache.org/jira/browse/MINIFICPP-39 > Project: NiFi MiNiFi C++ > Issue Type: Task >Reporter: Andrew Christianson >Assignee: Andrew Christianson >Priority: Minor > > Create a FocusArchive processor which implements a lens over an archive > (tar, etc.). A concise, though informal, definition of a lens is as follows: > "Essentially, they represent the act of “peering into” or “focusing in on” > some particular piece/path of a complex data object such that you can more > precisely target particular operations without losing the context or > structure of the overall data you’re working with." > https://medium.com/@dtipson/functional-lenses-d1aba9e52254#.hdgsvbraq > Why a FocusArchive in MiNiFi? Simply put, it will enable us to "focus in on" > an entry in the archive, perform processing *in-context* of that entry, then > re-focus on the overall archive. This allows for transformation or other > processing of an entry in the archive without losing the overall context of > the archive. > Initial format support is tar, due to its simplicity and ubiquity.
> Attributes: > - Path (the path in the archive to focus; "/" to re-focus the overall archive) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
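The focus / re-focus idea described in MINIFICPP-39 above can be sketched outside MiNiFi with Python's standard tarfile module. This is only an illustration of the lens concept, not the processor's implementation: the function names focus, unfocus and make_tar are hypothetical, and the whole archive is held in memory for simplicity.

```python
import io
import tarfile


def focus(archive_bytes, path):
    """'Focus in': return the content of one regular-file entry of a tar archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as tar:
        return tar.extractfile(tar.getmember(path)).read()


def unfocus(archive_bytes, path, new_content):
    """'Re-focus': rebuild the archive with one entry replaced, keeping the rest."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as src, \
            tarfile.open(fileobj=out, mode="w") as dst:
        for member in src.getmembers():
            if member.isfile() and member.name == path:
                # Replace the focused entry; its metadata is kept except for size.
                member.size = len(new_content)
                dst.addfile(member, io.BytesIO(new_content))
            elif member.isfile():
                dst.addfile(member, src.extractfile(member))
            else:
                # Directories, links, etc. carry no payload.
                dst.addfile(member)
    return out.getvalue()


def make_tar(entries):
    """Hypothetical helper: build an in-memory tar from a {name: bytes} dict."""
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w") as tar:
        for name, data in entries.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return out.getvalue()


archive = make_tar({"a.txt": b"hello", "b.txt": b"world"})
entry = focus(archive, "a.txt")                      # process the entry in isolation
updated = unfocus(archive, "a.txt", entry.upper())   # then restore archive context
```

The point of the sketch is that processing between focus and unfocus only sees the entry, while the other entries and their order survive the round trip.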
[jira] [Updated] (NIFI-4383) UpdateRecord - cannot update arrays elements
[ https://issues.apache.org/jira/browse/NIFI-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pierre Villard updated NIFI-4383: - Status: Patch Available (was: Open) > UpdateRecord - cannot update arrays elements > > > Key: NIFI-4383 > URL: https://issues.apache.org/jira/browse/NIFI-4383 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.4.0, 1.3.0 >Reporter: Pierre Villard >Assignee: Pierre Villard > Labels: records > > At the moment, if trying to use the update record to update the elements of > an array it won't have any effect. > Input: > {noformat} > { > "numbers" : [ 1, null, 4 ] > } > {noformat} > Parameters: > ||Path||Value||Expected output|| > |{{/numbers[*]}}|{{8}}|{{"numbers" : [ 8, 8, 8 ]}}| > |{{/numbers[1]}}|{{8}}|{{"numbers" : [ 1, 8, 4 ]}}| > |{{/numbers[0..1]}}|{{8}}|{{"numbers" : [ 8, 8, 4 ]}}| > |{{/numbers[0,2]}}|{{8}}|{{"numbers" : [ 8, null, 8 ]}}| > When elements of the array are records, it's possible to update fields of the > record but not the record itself as-is. > Also in the MultiArrayIndexPath implementation, index of array elements is > not correctly provided. Because of that, wrong elements of the array could be > updated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
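The expected outputs in the NIFI-4383 table above can be restated as a small Python model. This is a sketch of the intended semantics only, not NiFi's RecordPath implementation: the names select_indices and update_array are hypothetical, only the four path forms from the table are handled, and ranges are assumed to be inclusive (which is what the `[0..1]` row implies).

```python
import re


def select_indices(selector, length):
    """Resolve '*', 'i', 'i..j' and 'i,j' selectors to a list of array indices."""
    selector = selector.strip()
    if selector == "*":
        return list(range(length))
    indices = []
    for part in selector.split(","):
        part = part.strip()
        if ".." in part:
            start, end = (int(p) for p in part.split(".."))
            indices.extend(range(start, end + 1))  # assumed inclusive range
        else:
            indices.append(int(part))
    # Out-of-range (and negative) indices are ignored in this simplified model.
    return [i for i in indices if 0 <= i < length]


def update_array(record, path, value):
    """Return a copy of record with the selected array elements set to value."""
    match = re.fullmatch(r"/(\w+)\[(.+)\]", path)
    if match is None:
        raise ValueError(f"unsupported path: {path}")
    field, selector = match.groups()
    updated = list(record[field])
    for i in select_indices(selector, len(updated)):
        updated[i] = value
    return {**record, field: updated}


record = {"numbers": [1, None, 4]}
for path in ("/numbers[*]", "/numbers[1]", "/numbers[0..1]", "/numbers[0,2]"):
    print(path, update_array(record, path, 8))
```

Each selected index is written individually, which is the behavior the issue says MultiArrayIndexPath currently gets wrong when the element index is not provided correctly.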
[jira] [Commented] (NIFI-4484) Update doc screenshots in User Guide section "Adding Controller Services for Reporting Tasks"
[ https://issues.apache.org/jira/browse/NIFI-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203605#comment-16203605 ] ASF GitHub Bot commented on NIFI-4484: -- GitHub user andrewmlim opened a pull request: https://github.com/apache/nifi/pull/2209 NIFI-4484 Update screenshots in User Guide for Reporting Task Controller Services tab You can merge this pull request into a Git repository by running: $ git pull https://github.com/andrewmlim/nifi NIFI-4484 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi/pull/2209.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2209 commit 6baea8ccffe93e6ea6289cac8970f95e95f797bf Author: Matt Gilman Date: 2017-10-02T21:01:31Z NIFI-: - Upgrading to Jersey 2.x. - Updating NOTICE files where necessary. - Fixing checkstyle issues. This closes #2206. Signed-off-by: Andy LoPresto commit f86148f6bce18a9b9b63f1527c868b96b12188e2 Author: Andrew Lim Date: 2017-10-12T20:01:54Z NIFI-4484 Update screenshots in User Guide for Reporting Task Controller Services tab > Update doc screenshots in User Guide section "Adding Controller Services for > Reporting Tasks" > - > > Key: NIFI-4484 > URL: https://issues.apache.org/jira/browse/NIFI-4484 > Project: Apache NiFi > Issue Type: Improvement > Components: Documentation & Website >Affects Versions: 1.4.0 >Reporter: Andrew Lim >Assignee: Andrew Lim >Priority: Minor > > Per NIFI-3941, the tab name was changed from "Controller Services" to > "Reporting Task Controller Services". > The relevant screenshots need to be updated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] nifi pull request #2209: NIFI-4484 Update screenshots in User Guide for Repo...
GitHub user andrewmlim opened a pull request: https://github.com/apache/nifi/pull/2209 NIFI-4484 Update screenshots in User Guide for Reporting Task Controller Services tab You can merge this pull request into a Git repository by running: $ git pull https://github.com/andrewmlim/nifi NIFI-4484 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi/pull/2209.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2209 commit 6baea8ccffe93e6ea6289cac8970f95e95f797bf Author: Matt Gilman Date: 2017-10-02T21:01:31Z NIFI-: - Upgrading to Jersey 2.x. - Updating NOTICE files where necessary. - Fixing checkstyle issues. This closes #2206. Signed-off-by: Andy LoPresto commit f86148f6bce18a9b9b63f1527c868b96b12188e2 Author: Andrew Lim Date: 2017-10-12T20:01:54Z NIFI-4484 Update screenshots in User Guide for Reporting Task Controller Services tab ---
[jira] [Commented] (NIFI-4383) UpdateRecord - cannot update arrays elements
[ https://issues.apache.org/jira/browse/NIFI-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203579#comment-16203579 ] ASF GitHub Bot commented on NIFI-4383: -- GitHub user pvillard31 opened a pull request: https://github.com/apache/nifi/pull/2208 NIFI-4383 - Fix UpdateRecord when updating arrays elements Thank you for submitting a contribution to Apache NiFi. In order to streamline the review of the contribution we ask you to ensure the following steps have been taken: ### For all changes: - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message? - [ ] Does your PR title start with NIFI- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)? - [ ] Is your initial contribution a single, squashed commit? ### For code changes: - [ ] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder? - [ ] Have you written or updated unit tests to verify your changes? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly? - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly? - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered? ### Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/pvillard31/nifi NIFI-4383 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi/pull/2208.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2208 commit 5edcfa9ac7833de62ba8b3198c0ad32b16239035 Author: Pierre Villard Date: 2017-10-12T21:51:09Z NIFI-4383 - Fix UpdateRecord when updating arrays elements > UpdateRecord - cannot update arrays elements > > > Key: NIFI-4383 > URL: https://issues.apache.org/jira/browse/NIFI-4383 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.3.0, 1.4.0 >Reporter: Pierre Villard >Assignee: Pierre Villard > Labels: records > > At the moment, if trying to use the update record to update the elements of > an array it won't have any effect. > Input: > {noformat} > { > "numbers" : [ 1, null, 4 ] > } > {noformat} > Parameters: > ||Path||Value||Expected output|| > |{{/numbers[*]}}|{{8}}|{{"numbers" : [ 8, 8, 8 ]}}| > |{{/numbers[1]}}|{{8}}|{{"numbers" : [ 1, 8, 4 ]}}| > |{{/numbers[0..1]}}|{{8}}|{{"numbers" : [ 8, 8, 4 ]}}| > |{{/numbers[0,2]}}|{{8}}|{{"numbers" : [ 8, null, 8 ]}}| > When elements of the array are records, it's possible to update fields of the > record but not the record itself as-is. > Also in the MultiArrayIndexPath implementation, index of array elements is > not correctly provided. Because of that, wrong elements of the array could be > updated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] nifi pull request #2208: NIFI-4383 - Fix UpdateRecord when updating arrays e...
GitHub user pvillard31 opened a pull request: https://github.com/apache/nifi/pull/2208 NIFI-4383 - Fix UpdateRecord when updating arrays elements Thank you for submitting a contribution to Apache NiFi. In order to streamline the review of the contribution we ask you to ensure the following steps have been taken: ### For all changes: - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message? - [ ] Does your PR title start with NIFI- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)? - [ ] Is your initial contribution a single, squashed commit? ### For code changes: - [ ] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder? - [ ] Have you written or updated unit tests to verify your changes? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly? - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly? - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered? ### Note: Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible. 
You can merge this pull request into a Git repository by running: $ git pull https://github.com/pvillard31/nifi NIFI-4383 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/nifi/pull/2208.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2208 commit 5edcfa9ac7833de62ba8b3198c0ad32b16239035 Author: Pierre Villard Date: 2017-10-12T21:51:09Z NIFI-4383 - Fix UpdateRecord when updating arrays elements ---
[jira] [Updated] (NIFI-4383) UpdateRecord - cannot update arrays elements
[ https://issues.apache.org/jira/browse/NIFI-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pierre Villard updated NIFI-4383: - Labels: records (was: ) > UpdateRecord - cannot update arrays elements > > > Key: NIFI-4383 > URL: https://issues.apache.org/jira/browse/NIFI-4383 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.3.0, 1.4.0 >Reporter: Pierre Villard >Assignee: Pierre Villard > Labels: records > > At the moment, if trying to use the update record to update the elements of > an array it won't have any effect. > Input: > {noformat} > { > "numbers" : [ 1, null, 4 ] > } > {noformat} > Parameters: > ||Path||Value||Expected output|| > |{{/numbers[*]}}|{{8}}|{{"numbers" : [ 8, 8, 8 ]}}| > |{{/numbers[1]}}|{{8}}|{{"numbers" : [ 1, 8, 4 ]}}| > |{{/numbers[0..1]}}|{{8}}|{{"numbers" : [ 8, 8, 4 ]}}| > |{{/numbers[0,2]}}|{{8}}|{{"numbers" : [ 8, null, 8 ]}}| > When elements of the array are records, it's possible to update fields of the > record but not the record itself as-is. > Also in the MultiArrayIndexPath implementation, index of array elements is > not correctly provided. Because of that, wrong elements of the array could be > updated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (NIFI-4383) UpdateRecord - cannot update arrays elements
[ https://issues.apache.org/jira/browse/NIFI-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Villard updated NIFI-4383:
---------------------------------
    Affects Version/s: 1.4.0
[jira] [Updated] (NIFI-4383) UpdateRecord - cannot update arrays elements
[ https://issues.apache.org/jira/browse/NIFI-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Villard updated NIFI-4383:
---------------------------------
    Description:
At the moment, if trying to use the update record to update the elements of an array it won't have any effect.

Input:
{noformat}
{
  "numbers" : [ 1, null, 4 ]
}
{noformat}

Parameters:
||Path||Value||Expected output||
|{{/numbers[*]}}|{{8}}|{{"numbers" : [ 8, 8, 8 ]}}|
|{{/numbers[1]}}|{{8}}|{{"numbers" : [ 1, 8, 4 ]}}|
|{{/numbers[0..1]}}|{{8}}|{{"numbers" : [ 8, 8, 4 ]}}|
|{{/numbers[0,2]}}|{{8}}|{{"numbers" : [ 8, null, 8 ]}}|

When elements of the array are records, it's possible to update fields of the record but not the record itself as-is.

Also in the MultiArrayIndexPath implementation, index of array elements is not correctly provided. Because of that, wrong elements of the array could be updated.

  was:
At the moment, if trying to use the update record to update an array of simple fields (not records) it won't have any effect.

Input:
{noformat}
{
  "numbers" : [ 1, null, 4 ]
}
{noformat}

Parameters:
||Path||Value||Expected output||
|{{/numbers[*]}}|{{8}}|{{"numbers" : [ 8, 8, 8 ]}}|
|{{/numbers[1]}}|{{8}}|{{"numbers" : [ 1, 8, 4 ]}}|
|{{/numbers[0..1]}}|{{8}}|{{"numbers" : [ 8, 8, 4 ]}}|
|{{/numbers[0,2]}}|{{8}}|{{"numbers" : [ 8, null, 8 ]}}|
[jira] [Updated] (NIFI-4383) UpdateRecord - cannot update arrays elements
[ https://issues.apache.org/jira/browse/NIFI-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Villard updated NIFI-4383:
---------------------------------
    Summary: UpdateRecord - cannot update arrays elements  (was: Fix UpdateRecord when updating arrays of simple fields)
[jira] [Commented] (NIFI-4325) Create a new ElasticSearch processor that supports the JSON DSL
[ https://issues.apache.org/jira/browse/NIFI-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203507#comment-16203507 ] ASF GitHub Bot commented on NIFI-4325: -- Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2113 @mattyb149 I'm going to leave this open, but I decided to refactor the heck out of it around a client service for ElasticSearch. The service only has one method for now, but I think it's the way to go so that in the future as services become injectable in scripts and such, it'll be more flexible. > Create a new ElasticSearch processor that supports the JSON DSL > --- > > Key: NIFI-4325 > URL: https://issues.apache.org/jira/browse/NIFI-4325 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mike Thomsen >Priority: Minor > > The existing ElasticSearch processors use the Lucene-style syntax for > querying, not the JSON DSL. A new processor is needed that can take a full > JSON query and execute it. It should also support aggregation queries in this > syntax. A user needs to be able to take a query as-is from Kibana and drop it > into NiFi and have it just run. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] nifi issue #2113: NIFI-4325 Added new processor that uses the JSON DSL.
Github user MikeThomsen commented on the issue: https://github.com/apache/nifi/pull/2113 @mattyb149 I'm going to leave this open, but I decided to refactor the heck out of it around a client service for ElasticSearch. The service only has one method for now, but I think it's the way to go so that in the future as services become injectable in scripts and such, it'll be more flexible. ---
[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor
Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2199#discussion_r144532208

    --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java ---
    @@ -66,42 +79,64 @@
     import org.apache.solr.common.SolrDocument;
     import org.apache.solr.common.SolrDocumentList;
     import org.apache.solr.common.SolrInputDocument;
    +import org.apache.solr.common.params.CursorMarkParams;
    -@Tags({"Apache", "Solr", "Get", "Pull"})
    +@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
     @InputRequirement(Requirement.INPUT_FORBIDDEN)
    -@CapabilityDescription("Queries Solr and outputs the results as a FlowFile")
    +@CapabilityDescription("Queries Solr and outputs the results as a FlowFile in the format of XML or using a Record Writer")
    +@Stateful(scopes = {Scope.LOCAL}, description = "Stores latest date of Date Field so that the same data will not be fetched multiple times.")
    --- End diff --

    GetSolr used to store lastEndDate in a local file. We need migration code so that lastEndDate is carried over to managed state when there is no state yet but the lastEndDate file exists.

---
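The migration asked for in this comment could look roughly like the sketch below. Everything in it is an assumption made for illustration: a plain `Map` stands in for NiFi's managed state, the key name `lastEndDate` and the one-value text file layout are hypothetical, and the sample timestamp is made up. It only shows the condition from the review, which is to migrate when no state exists but the legacy file does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: the state map, key name and file layout are assumptions.
class StateMigrationSketch {

    static final String STATE_KEY = "lastEndDate"; // hypothetical state key

    // Copies the legacy value into the state map only when the state has no
    // value yet and the legacy file exists; returns true if it migrated.
    static boolean migrateIfNeeded(Map<String, String> state, Path legacyFile) {
        if (state.containsKey(STATE_KEY) || !Files.exists(legacyFile)) {
            return false;
        }
        try {
            String value = new String(Files.readAllBytes(legacyFile), StandardCharsets.UTF_8).trim();
            if (value.isEmpty()) {
                return false;
            }
            state.put(STATE_KEY, value);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs the migration twice against a temporary legacy file (sample value).
    static boolean[] demo() {
        try {
            Path legacyFile = Files.createTempFile("lastEndDate", ".txt");
            Files.write(legacyFile, "2017-10-12T21:51:09Z".getBytes(StandardCharsets.UTF_8));
            Map<String, String> state = new HashMap<>();
            boolean first = migrateIfNeeded(state, legacyFile);
            boolean second = migrateIfNeeded(state, legacyFile);
            Files.delete(legacyFile);
            return new boolean[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] results = demo();
        System.out.println("first run migrated: " + results[0]);
        System.out.println("second run migrated: " + results[1]);
    }
}
```

The second call does nothing because the state already holds a value, so the legacy file can never overwrite newer state.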
[jira] [Commented] (NIFI-3248) GetSolr can miss recently updated documents
[ https://issues.apache.org/jira/browse/NIFI-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203452#comment-16203452 ]

ASF GitHub Bot commented on NIFI-3248:
--------------------------------------

Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2199#discussion_r144533090

    --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java ---
    @@ -66,42 +79,64 @@
     import org.apache.solr.common.SolrDocument;
     import org.apache.solr.common.SolrDocumentList;
     import org.apache.solr.common.SolrInputDocument;
    +import org.apache.solr.common.params.CursorMarkParams;
    -@Tags({"Apache", "Solr", "Get", "Pull"})
    +@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
     @InputRequirement(Requirement.INPUT_FORBIDDEN)
    -@CapabilityDescription("Queries Solr and outputs the results as a FlowFile")
    +@CapabilityDescription("Queries Solr and outputs the results as a FlowFile in the format of XML or using a Record Writer")
    +@Stateful(scopes = {Scope.LOCAL}, description = "Stores latest date of Date Field so that the same data will not be fetched multiple times.")
    --- End diff --

    State scope should be CLUSTER, I think. Also, the capability description should mention that this processor is designed to run on the Primary Node only. Please refer to the ListHDFS processor documentation. Or does this processor work nicely in a distributed fashion by utilizing multiple NiFi nodes against a Solr cluster?

> GetSolr can miss recently updated documents
> -------------------------------------------
>
>                 Key: NIFI-3248
>                 URL: https://issues.apache.org/jira/browse/NIFI-3248
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.0.0, 0.5.0, 0.6.0, 0.5.1, 0.7.0, 0.6.1, 1.1.0, 0.7.1, 1.0.1
>            Reporter: Koji Kawamura
>            Assignee: Johannes Peter
>         Attachments: nifi-flow.png, query-result-with-curly-bracket.png, query-result-with-square-bracket.png
>
> GetSolr holds the last query timestamp so that it only fetches documents
> that have been added or updated since the last query.
> However, GetSolr misses some of those updated documents, and once a
> document's date field value becomes older than the last query timestamp, the
> document can no longer be queried by GetSolr.
> This JIRA is for tracking the investigation of this behavior and the
> discussion of it.
> Here are things that can be a cause of this behavior:
> |#|Short description|Should we address it?|
> |1|Timestamp range filter, curly or square bracket?|No|
> |2|Timezone difference between update and query|Additional docs might be helpful|
> |3|Lag comes from NearRealTIme nature of Solr|Should be documented at least, add 'commit lag-time'?|
> h2. 1. Timestamp range filter, curly or square bracket?
> At first glance, mixing curly and square brackets looked strange
> ([source code|https://github.com/apache/nifi/blob/support/nifi-0.5.x/nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java#L202]).
> But this difference has a meaning.
> The square bracket on the range query is inclusive and the curly bracket is
> exclusive. If we use inclusive on both sides and a document has a time stamp
> exactly on the boundary then it could be returned in two consecutive
> executions, and we only want it in one.
> This is intentional, and it should be as it is.
> h2. 2. Timezone difference between update and query
> Solr treats date fields as [UTC
> representation|https://cwiki.apache.org/confluence/display/solr/Working+with+Dates|].
> If the date field String value of an updated document represents time without
> a timezone, and NiFi is running in an environment using a timezone other than
> UTC, GetSolr can't perform the date range query as users expect.
> Let's say NiFi is running with JST(UTC+9). A process added a document to Solr
> at 15:00 JST. But the date field doesn't have a timezone. So, Solr indexed it
> as 15:00 UTC. Then GetSolr performs a range query at 15:10 JST, targeting any
> documents updated from 15:00 to 15:10 JST. GetSolr formatted dates using UTC,
> i.e. 6:00 to 6:10 UTC. The updated document won't be matched by the date
> range filter.
> To avoid this, updated documents must have a proper timezone in the date field
> string representation.
> If one uses NiFi expression language to set the current timestamp to that date
> field, the following NiFi expression can be used:
> {code}
> ${now():format("yyyy-MM-dd'T'HH:mm:ss.SSSZ")}
> {code}
> It will produce a result like:
> {code}
> 2016-12-27T15:30:04.895+0900
> {code}
> Then it will be indexed in Solr with UTC and will be queried by GetSolr as
> expected.
> h2. 3. Lag comes from NearRealTIme nature
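The JST example in section 2 of the quoted description can be reproduced with plain `java.time`, as in the sketch below. It models only the arithmetic: `TimezoneRangeSketch` and its methods are hypothetical names, no Solr or NiFi API is involved, and both range bounds are treated as inclusive for simplicity, which is an assumption and not the bracket choice discussed in section 1.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative only: models the 15:00 JST example with java.time, no Solr involved.
class TimezoneRangeSketch {

    // Same pattern as the expression language example in the description.
    static final DateTimeFormatter WITH_OFFSET =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSZ");

    // Query window 15:00 to 15:10 JST, expressed in UTC as the description states.
    static final Instant FROM = Instant.parse("2016-12-27T06:00:00Z");
    static final Instant TO = Instant.parse("2016-12-27T06:10:00Z");

    // Mimics an index that reads a zone-less timestamp as if it were UTC.
    static Instant indexedWithoutZone(String value) {
        return LocalDateTime.parse(value).toInstant(ZoneOffset.UTC);
    }

    // Reads a timestamp that carries its own offset, e.g. "+0900".
    static Instant indexedWithOffset(String value) {
        return OffsetDateTime.parse(value, WITH_OFFSET).toInstant();
    }

    // Both bounds inclusive here for simplicity (an assumption of this sketch).
    static boolean inRange(Instant timestamp) {
        return !timestamp.isBefore(FROM) && !timestamp.isAfter(TO);
    }

    public static void main(String[] args) {
        Instant zoneLess = indexedWithoutZone("2016-12-27T15:00:00");
        Instant zoned = indexedWithOffset("2016-12-27T15:00:00.000+0900");
        System.out.println("zone-less value matched: " + inRange(zoneLess));
        System.out.println("value with offset matched: " + inRange(zoned));
        System.out.println(OffsetDateTime
                .of(2016, 12, 27, 15, 30, 4, 895_000_000, ZoneOffset.ofHours(9))
                .format(WITH_OFFSET));
    }
}
```

The zone-less value lands at 15:00 UTC, nine hours after the window, while the value written with `+0900` resolves to 06:00 UTC and falls inside it.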
[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor
Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2199#discussion_r144533090

    --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java ---
    @@ -66,42 +79,64 @@
     import org.apache.solr.common.SolrDocument;
     import org.apache.solr.common.SolrDocumentList;
     import org.apache.solr.common.SolrInputDocument;
    +import org.apache.solr.common.params.CursorMarkParams;
    -@Tags({"Apache", "Solr", "Get", "Pull"})
    +@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
     @InputRequirement(Requirement.INPUT_FORBIDDEN)
    -@CapabilityDescription("Queries Solr and outputs the results as a FlowFile")
    +@CapabilityDescription("Queries Solr and outputs the results as a FlowFile in the format of XML or using a Record Writer")
    +@Stateful(scopes = {Scope.LOCAL}, description = "Stores latest date of Date Field so that the same data will not be fetched multiple times.")
    --- End diff --

    State scope should be CLUSTER, I think. Also, the capability description should mention that this processor is designed to run on the Primary Node only. Please refer to the ListHDFS processor documentation. Or does this processor work nicely in a distributed fashion by utilizing multiple NiFi nodes against a Solr cluster?

---
[jira] [Commented] (NIFI-3248) GetSolr can miss recently updated documents
[ https://issues.apache.org/jira/browse/NIFI-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203447#comment-16203447 ]

ASF GitHub Bot commented on NIFI-3248:
--------------------------------------

Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2199#discussion_r144532208

    --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java ---
    @@ -66,42 +79,64 @@
     import org.apache.solr.common.SolrDocument;
     import org.apache.solr.common.SolrDocumentList;
     import org.apache.solr.common.SolrInputDocument;
    +import org.apache.solr.common.params.CursorMarkParams;
    -@Tags({"Apache", "Solr", "Get", "Pull"})
    +@Tags({"Apache", "Solr", "Get", "Pull", "Records"})
     @InputRequirement(Requirement.INPUT_FORBIDDEN)
    -@CapabilityDescription("Queries Solr and outputs the results as a FlowFile")
    +@CapabilityDescription("Queries Solr and outputs the results as a FlowFile in the format of XML or using a Record Writer")
    +@Stateful(scopes = {Scope.LOCAL}, description = "Stores latest date of Date Field so that the same data will not be fetched multiple times.")
    --- End diff --

    GetSolr used to use local file to store lastEndDate. We need migration code so that lastEndDate to be taken over to managed state when there's no state but the lastEndDate file exists.
[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor
Github user ijokarumawak commented on a diff in the pull request: https://github.com/apache/nifi/pull/2199#discussion_r144527126 --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/test/resources/solr/testCollection/conf/schema.xml --- @@ -16,6 +16,16 @@ + + + + + + + + +id --- End diff -- What if Solr doc doesn't have an uniqueKey? Does this processor still work without uniqueKey?? ---
[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor
Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2199#discussion_r144526595

    --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrProcessor.java ---
    @@ -275,7 +275,7 @@ protected final boolean isBasicAuthEnabled() {
         }

         @Override
    -    protected final Collection customValidate(ValidationContext context) {
    +    protected Collection customValidate(ValidationContext context) {
    --- End diff --

    Shouldn't we add another protected method to override at sub-classes?

---
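One way to read this suggestion is the template-method pattern: the base `customValidate` stays `final` and calls a separate hook that sub-classes override. The sketch below illustrates that reading only. All class and method names (`ValidationHookSketch`, `additionalCustomValidation`, and so on) are hypothetical, and plain strings stand in for NiFi's validation result type.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Illustrative only: plain strings stand in for NiFi's validation results,
// and every name in this sketch is hypothetical.
class ValidationHookSketch {

    abstract static class BaseProcessor {
        // Stays final so the shared checks cannot be skipped by a sub-class.
        protected final Collection<String> customValidate(String config) {
            List<String> problems = new ArrayList<>();
            if (config == null || config.isEmpty()) {
                problems.add("config must not be empty");
            }
            problems.addAll(additionalCustomValidation(config));
            return problems;
        }

        // Hook that sub-classes override to contribute their own checks.
        protected Collection<String> additionalCustomValidation(String config) {
            return new ArrayList<>();
        }
    }

    static class GetProcessor extends BaseProcessor {
        @Override
        protected Collection<String> additionalCustomValidation(String config) {
            List<String> problems = new ArrayList<>();
            if (config != null && !config.contains("writer")) {
                problems.add("a record writer is required");
            }
            return problems;
        }
    }

    // Convenience entry point so the result can be inspected from outside.
    static List<String> validate(String config) {
        return new ArrayList<>(new GetProcessor().customValidate(config));
    }

    public static void main(String[] args) {
        System.out.println(validate("solr"));
        System.out.println(validate("solr+writer"));
        System.out.println(validate(""));
    }
}
```

Compared with dropping `final`, the hook keeps the base-class checks mandatory while still letting a sub-class such as GetSolr add processor-specific validation.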
[jira] [Commented] (NIFI-3248) GetSolr can miss recently updated documents
[ https://issues.apache.org/jira/browse/NIFI-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203450#comment-16203450 ] ASF GitHub Bot commented on NIFI-3248: -- Github user ijokarumawak commented on a diff in the pull request: https://github.com/apache/nifi/pull/2199#discussion_r144526595 --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/SolrProcessor.java --- @@ -275,7 +275,7 @@ protected final boolean isBasicAuthEnabled() { } @Override -protected final Collection customValidate(ValidationContext context) { +protected Collection customValidate(ValidationContext context) { --- End diff -- Shouldn't we add another protected method to override at sub-classes? > GetSolr can miss recently updated documents > --- > > Key: NIFI-3248 > URL: https://issues.apache.org/jira/browse/NIFI-3248 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.0.0, 0.5.0, 0.6.0, 0.5.1, 0.7.0, 0.6.1, 1.1.0, 0.7.1, > 1.0.1 >Reporter: Koji Kawamura >Assignee: Johannes Peter > Attachments: nifi-flow.png, query-result-with-curly-bracket.png, > query-result-with-square-bracket.png > > > GetSolr holds the last query timestamp so that it only fetches documents > those have been added or updated since the last query. > However, GetSolr misses some of those updated documents, and once the > documents date field value becomes older than last query timestamp, the > document won't be able to be queried by GetSolr any more. > This JIRA is for tracking the process of investigating this behavior, and > discussion on them. > Here are things that can be a cause of this behavior: > |#|Short description|Should we address it?| > |1|Timestamp range filter, curly or square bracket?|No| > |2|Timezone difference between update and query|Additional docs might be > helpful| > |3|Lag comes from NearRealTIme nature of Solr|Should be documented at least, > add 'commit lag-time'?| > h2. 1. 
Timestamp range filter, curly or square bracket? > At the first glance, using curly and square bracket in mix looked strange > ([source > code|https://github.com/apache/nifi/blob/support/nifi-0.5.x/nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java#L202]). > But these difference has a meaning. > The square bracket on the range query is inclusive and the curly bracket is > exclusive. If we use inclusive on both sides and a document has a time stamp > exactly on the boundary then it could be returned in two consecutive > executions, and we only want it in one. > This is intentional, and it should be as it is. > h2. 2. Timezone difference between update and query > Solr treats date fields as [UTC > representation|https://cwiki.apache.org/confluence/display/solr/Working+with+Dates|]. > If date field String value of an updated document represents time without > timezone, and NiFi is running on an environment using timezone other than > UTC, GetSolr can't perform date range query as users expect. > Let's say NiFi is running with JST(UTC+9). A process added a document to Solr > at 15:00 JST. But the date field doesn't have timezone. So, Solr indexed it > as 15:00 UTC. Then GetSolr performs range query at 15:10 JST, targeting any > documents updated from 15:00 to 15:10 JST. GetSolr formatted dates using UTC, > i.e. 6:00 to 6:10 UTC. The updated document won't be matched with the date > range filter. > To avoid this, updated documents must have proper timezone in date field > string representation. > If one uses NiFi expression language to set current timestamp to that date > field, following NiFi expression can be used: > {code} > ${now():format("yyyy-MM-dd'T'HH:mm:ss.SSSZ")} > {code} > It will produce a result like: > {code} > 2016-12-27T15:30:04.895+0900 > {code} > Then it will be indexed in Solr with UTC and will be queried by GetSolr as > expected. > h2. 3. 
Lag comes from NearRealTIme nature of Solr > Solr provides Near Real Time search capability, that means, the recently > updated documents can be queried in Near Real Time, but it's not real time. > This latency can be controlled by either on client side which requests the > update operation by specifying "commitWithin" parameter, or on the Solr > server side, "autoCommit" and "autoSoftCommit" in > [solrconfig.xml|https://cwiki.apache.org/confluence/display/solr/UpdateHandlers+in+SolrConfig#UpdateHandlersinSolrConfig-Commits]. > Since commit and updating index can be costly, it's recommended to set this > interval long enough up to the maximum tolerable latency. > However, this can be problematic with GetSolr. For instance, as shown in the > simple NiFi flow
[jira] [Commented] (NIFI-3248) GetSolr can miss recently updated documents
[ https://issues.apache.org/jira/browse/NIFI-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203448#comment-16203448 ]

ASF GitHub Bot commented on NIFI-3248:
--------------------------------------

Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2199#discussion_r144530918

    --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java ---
    @@ -172,157 +203,196 @@ protected void init(final ProcessorInitializationContext context) {
         @Override
         public void onPropertyModified(PropertyDescriptor descriptor, String oldValue, String newValue) {
    -        lastEndDatedRef.set(UNINITIALIZED_LAST_END_DATE_VALUE);
    +        clearState.set(true);
    --- End diff --

    Probably we'd like to clear state only when following properties get changed? It would be a bad UX if state is cleared when user re-configure batch size.

    - SOLR_TYPE
    - SOLR_LOCATION
    - COLLECTION
    - SOLR_QUERY
    - DATE_FIELD
    - RETURN_FIELDS
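The review comment on `onPropertyModified` in this message (clear state only when one of SOLR_TYPE, SOLR_LOCATION, COLLECTION, SOLR_QUERY, DATE_FIELD or RETURN_FIELDS changes) can be sketched as follows. This is a simplified stand-in, not the processor's code: property descriptors are reduced to plain strings, and `SelectiveStateResetSketch` and the `BATCH_SIZE` name are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: property descriptors are reduced to plain strings here.
class SelectiveStateResetSketch {

    // Properties named in the review comment; a change to any of them
    // invalidates the stored state.
    static final Set<String> STATE_RELEVANT = new HashSet<>(Arrays.asList(
            "SOLR_TYPE", "SOLR_LOCATION", "COLLECTION",
            "SOLR_QUERY", "DATE_FIELD", "RETURN_FIELDS"));

    final AtomicBoolean clearState = new AtomicBoolean(false);

    // Flags the state for clearing only when a state-relevant property changes.
    void onPropertyModified(String propertyName, String oldValue, String newValue) {
        if (STATE_RELEVANT.contains(propertyName)) {
            clearState.set(true);
        }
    }

    public static void main(String[] args) {
        SelectiveStateResetSketch processor = new SelectiveStateResetSketch();
        // "BATCH_SIZE" is a hypothetical name for the batch size property.
        processor.onPropertyModified("BATCH_SIZE", "100", "500");
        System.out.println("after batch size change: " + processor.clearState.get());
        processor.onPropertyModified("SOLR_QUERY", "*:*", "type:doc");
        System.out.println("after query change: " + processor.clearState.get());
    }
}
```

With this check, re-configuring the batch size leaves the stored date untouched, which is the user-experience concern raised in the comment.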
[jira] [Commented] (NIFI-3248) GetSolr can miss recently updated documents
[ https://issues.apache.org/jira/browse/NIFI-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203449#comment-16203449 ] ASF GitHub Bot commented on NIFI-3248: -- Github user ijokarumawak commented on a diff in the pull request: https://github.com/apache/nifi/pull/2199#discussion_r144530989 --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java --- @@ -138,10 +168,11 @@ protected void init(final ProcessorInitializationContext context) { descriptors.add(SOLR_TYPE); descriptors.add(SOLR_LOCATION); descriptors.add(COLLECTION); +descriptors.add(RETURN_TYPE); +descriptors.add(RECORD_WRITER); descriptors.add(SOLR_QUERY); -descriptors.add(RETURN_FIELDS); -descriptors.add(SORT_CLAUSE); --- End diff -- Is it safe to remove an existing property? The existing code should not sort result anyway, or should store last sorted field value to paginate properly when docs with the same date split more than one page. So I think it's safe.. > GetSolr can miss recently updated documents > --- > > Key: NIFI-3248 > URL: https://issues.apache.org/jira/browse/NIFI-3248 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.0.0, 0.5.0, 0.6.0, 0.5.1, 0.7.0, 0.6.1, 1.1.0, 0.7.1, > 1.0.1 >Reporter: Koji Kawamura >Assignee: Johannes Peter > Attachments: nifi-flow.png, query-result-with-curly-bracket.png, > query-result-with-square-bracket.png > > > GetSolr holds the last query timestamp so that it only fetches documents > those have been added or updated since the last query. > However, GetSolr misses some of those updated documents, and once the > documents date field value becomes older than last query timestamp, the > document won't be able to be queried by GetSolr any more. > This JIRA is for tracking the process of investigating this behavior, and > discussion on them. 
> Here are things that can be a cause of this behavior: > |#|Short description|Should we address it?| > |1|Timestamp range filter, curly or square bracket?|No| > |2|Timezone difference between update and query|Additional docs might be > helpful| > |3|Lag comes from NearRealTIme nature of Solr|Should be documented at least, > add 'commit lag-time'?| > h2. 1. Timestamp range filter, curly or square bracket? > At the first glance, using curly and square bracket in mix looked strange > ([source > code|https://github.com/apache/nifi/blob/support/nifi-0.5.x/nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java#L202]). > But these difference has a meaning. > The square bracket on the range query is inclusive and the curly bracket is > exclusive. If we use inclusive on both sides and a document has a time stamp > exactly on the boundary then it could be returned in two consecutive > executions, and we only want it in one. > This is intentional, and it should be as it is. > h2. 2. Timezone difference between update and query > Solr treats date fields as [UTC > representation|https://cwiki.apache.org/confluence/display/solr/Working+with+Dates|]. > If date field String value of an updated document represents time without > timezone, and NiFi is running on an environment using timezone other than > UTC, GetSolr can't perform date range query as users expect. > Let's say NiFi is running with JST(UTC+9). A process added a document to Solr > at 15:00 JST. But the date field doesn't have timezone. So, Solr indexed it > as 15:00 UTC. Then GetSolr performs range query at 15:10 JST, targeting any > documents updated from 15:00 to 15:10 JST. GetSolr formatted dates using UTC, > i.e. 6:00 to 6:10 UTC. The updated document won't be matched with the date > range filter. > To avoid this, updated documents must have proper timezone in date field > string representation. 
> If one uses NiFi expression language to set current timestamp to that date > field, following NiFi expression can be used: > {code} > ${now():format("yyyy-MM-dd'T'HH:mm:ss.SSSZ")} > {code} > It will produce a result like: > {code} > 2016-12-27T15:30:04.895+0900 > {code} > Then it will be indexed in Solr with UTC and will be queried by GetSolr as > expected. > h2. 3. Lag comes from NearRealTIme nature of Solr > Solr provides Near Real Time search capability, that means, the recently > updated documents can be queried in Near Real Time, but it's not real time. > This latency can be controlled by either on client side which requests the > update operation by specifying "commitWithin" parameter, or on the Solr > server side, "autoCommit" and "autoSoftCommit" in >
[jira] [Commented] (NIFI-3248) GetSolr can miss recently updated documents
[ https://issues.apache.org/jira/browse/NIFI-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203446#comment-16203446 ] ASF GitHub Bot commented on NIFI-3248: -- Github user ijokarumawak commented on a diff in the pull request: https://github.com/apache/nifi/pull/2199#discussion_r144527126 --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/test/resources/solr/testCollection/conf/schema.xml --- @@ -16,6 +16,16 @@ + + + + + + + + +id --- End diff -- What if Solr doc doesn't have an uniqueKey? Does this processor still work without uniqueKey?? > GetSolr can miss recently updated documents > --- > > Key: NIFI-3248 > URL: https://issues.apache.org/jira/browse/NIFI-3248 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.0.0, 0.5.0, 0.6.0, 0.5.1, 0.7.0, 0.6.1, 1.1.0, 0.7.1, > 1.0.1 >Reporter: Koji Kawamura >Assignee: Johannes Peter > Attachments: nifi-flow.png, query-result-with-curly-bracket.png, > query-result-with-square-bracket.png > > > GetSolr holds the last query timestamp so that it only fetches documents > those have been added or updated since the last query. > However, GetSolr misses some of those updated documents, and once the > documents date field value becomes older than last query timestamp, the > document won't be able to be queried by GetSolr any more. > This JIRA is for tracking the process of investigating this behavior, and > discussion on them. > Here are things that can be a cause of this behavior: > |#|Short description|Should we address it?| > |1|Timestamp range filter, curly or square bracket?|No| > |2|Timezone difference between update and query|Additional docs might be > helpful| > |3|Lag comes from NearRealTIme nature of Solr|Should be documented at least, > add 'commit lag-time'?| > h2. 1. Timestamp range filter, curly or square bracket? 
> At the first glance, using curly and square bracket in mix looked strange > ([source > code|https://github.com/apache/nifi/blob/support/nifi-0.5.x/nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java#L202]). > But these difference has a meaning. > The square bracket on the range query is inclusive and the curly bracket is > exclusive. If we use inclusive on both sides and a document has a time stamp > exactly on the boundary then it could be returned in two consecutive > executions, and we only want it in one. > This is intentional, and it should be as it is. > h2. 2. Timezone difference between update and query > Solr treats date fields as [UTC > representation|https://cwiki.apache.org/confluence/display/solr/Working+with+Dates|]. > If date field String value of an updated document represents time without > timezone, and NiFi is running on an environment using timezone other than > UTC, GetSolr can't perform date range query as users expect. > Let's say NiFi is running with JST(UTC+9). A process added a document to Solr > at 15:00 JST. But the date field doesn't have timezone. So, Solr indexed it > as 15:00 UTC. Then GetSolr performs range query at 15:10 JST, targeting any > documents updated from 15:00 to 15:10 JST. GetSolr formatted dates using UTC, > i.e. 6:00 to 6:10 UTC. The updated document won't be matched with the date > range filter. > To avoid this, updated documents must have proper timezone in date field > string representation. > If one uses NiFi expression language to set current timestamp to that date > field, following NiFi expression can be used: > {code} > ${now():format("yyyy-MM-dd'T'HH:mm:ss.SSSZ")} > {code} > It will produce a result like: > {code} > 2016-12-27T15:30:04.895+0900 > {code} > Then it will be indexed in Solr with UTC and will be queried by GetSolr as > expected. > h2. 3.
Lag comes from NearRealTIme nature of Solr > Solr provides Near Real Time search capability, that means, the recently > updated documents can be queried in Near Real Time, but it's not real time. > This latency can be controlled by either on client side which requests the > update operation by specifying "commitWithin" parameter, or on the Solr > server side, "autoCommit" and "autoSoftCommit" in > [solrconfig.xml|https://cwiki.apache.org/confluence/display/solr/UpdateHandlers+in+SolrConfig#UpdateHandlersinSolrConfig-Commits]. > Since commit and updating index can be costly, it's recommended to set this > interval long enough up to the maximum tolerable latency. > However, this can be problematic with GetSolr. For instance, as shown in the > simple NiFi flow below, GetSolr can miss updated documents: > {code} > t1: GetSolr queried > t2:
[jira] [Commented] (NIFI-3248) GetSolr can miss recently updated documents
[ https://issues.apache.org/jira/browse/NIFI-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203451#comment-16203451 ] ASF GitHub Bot commented on NIFI-3248: -- Github user ijokarumawak commented on a diff in the pull request: https://github.com/apache/nifi/pull/2199#discussion_r144533800 --- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java --- @@ -172,157 +203,196 @@ protected void init(final ProcessorInitializationContext context) { @Override public void onPropertyModified(PropertyDescriptor descriptor, String oldValue, String newValue) { -lastEndDatedRef.set(UNINITIALIZED_LAST_END_DATE_VALUE); +clearState.set(true); } -@OnStopped -public void onStopped() { -writeLastEndDate(); -} +@OnScheduled +public void onScheduled2(final ProcessContext context) throws IOException { --- End diff -- Please change method name appropriately to represent what it does, such as `clearState`. The annotation explains when it's called. > GetSolr can miss recently updated documents > --- > > Key: NIFI-3248 > URL: https://issues.apache.org/jira/browse/NIFI-3248 > Project: Apache NiFi > Issue Type: Bug > Components: Extensions >Affects Versions: 1.0.0, 0.5.0, 0.6.0, 0.5.1, 0.7.0, 0.6.1, 1.1.0, 0.7.1, > 1.0.1 >Reporter: Koji Kawamura >Assignee: Johannes Peter > Attachments: nifi-flow.png, query-result-with-curly-bracket.png, > query-result-with-square-bracket.png > > > GetSolr holds the last query timestamp so that it only fetches documents > those have been added or updated since the last query. > However, GetSolr misses some of those updated documents, and once the > documents date field value becomes older than last query timestamp, the > document won't be able to be queried by GetSolr any more. > This JIRA is for tracking the process of investigating this behavior, and > discussion on them. 
> Here are things that can be a cause of this behavior: > |#|Short description|Should we address it?| > |1|Timestamp range filter, curly or square bracket?|No| > |2|Timezone difference between update and query|Additional docs might be > helpful| > |3|Lag comes from NearRealTIme nature of Solr|Should be documented at least, > add 'commit lag-time'?| > h2. 1. Timestamp range filter, curly or square bracket? > At the first glance, using curly and square bracket in mix looked strange > ([source > code|https://github.com/apache/nifi/blob/support/nifi-0.5.x/nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java#L202]). > But these difference has a meaning. > The square bracket on the range query is inclusive and the curly bracket is > exclusive. If we use inclusive on both sides and a document has a time stamp > exactly on the boundary then it could be returned in two consecutive > executions, and we only want it in one. > This is intentional, and it should be as it is. > h2. 2. Timezone difference between update and query > Solr treats date fields as [UTC > representation|https://cwiki.apache.org/confluence/display/solr/Working+with+Dates|]. > If date field String value of an updated document represents time without > timezone, and NiFi is running on an environment using timezone other than > UTC, GetSolr can't perform date range query as users expect. > Let's say NiFi is running with JST(UTC+9). A process added a document to Solr > at 15:00 JST. But the date field doesn't have timezone. So, Solr indexed it > as 15:00 UTC. Then GetSolr performs range query at 15:10 JST, targeting any > documents updated from 15:00 to 15:10 JST. GetSolr formatted dates using UTC, > i.e. 6:00 to 6:10 UTC. The updated document won't be matched with the date > range filter. > To avoid this, updated documents must have proper timezone in date field > string representation. 
> If one uses NiFi expression language to set current timestamp to that date > field, following NiFi expression can be used: > {code} > ${now():format("yyyy-MM-dd'T'HH:mm:ss.SSSZ")} > {code} > It will produce a result like: > {code} > 2016-12-27T15:30:04.895+0900 > {code} > Then it will be indexed in Solr with UTC and will be queried by GetSolr as > expected. > h2. 3. Lag comes from NearRealTIme nature of Solr > Solr provides Near Real Time search capability, that means, the recently > updated documents can be queried in Near Real Time, but it's not real time. > This latency can be controlled by either on client side which requests the > update operation by specifying "commitWithin" parameter, or on the Solr > server side, "autoCommit" and "autoSoftCommit" in >
[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor
Github user ijokarumawak commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2199#discussion_r144530989
--- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java ---
@@ -138,10 +168,11 @@ protected void init(final ProcessorInitializationContext context) {
 descriptors.add(SOLR_TYPE);
 descriptors.add(SOLR_LOCATION);
 descriptors.add(COLLECTION);
+descriptors.add(RETURN_TYPE);
+descriptors.add(RECORD_WRITER);
 descriptors.add(SOLR_QUERY);
-descriptors.add(RETURN_FIELDS);
-descriptors.add(SORT_CLAUSE);
--- End diff --
Is it safe to remove an existing property? The existing code should not sort results anyway; otherwise it would have to store the last sorted field value to paginate properly when docs with the same date span more than one page. So I think it's safe.
---
[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor
Github user ijokarumawak commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2199#discussion_r144530918
--- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java ---
@@ -172,157 +203,196 @@ protected void init(final ProcessorInitializationContext context) {
 @Override
 public void onPropertyModified(PropertyDescriptor descriptor, String oldValue, String newValue) {
-lastEndDatedRef.set(UNINITIALIZED_LAST_END_DATE_VALUE);
+clearState.set(true);
--- End diff --
We should probably clear state only when one of the following properties changes. It would be bad UX if state were cleared whenever a user reconfigures the batch size.
- SOLR_TYPE
- SOLR_LOCATION
- COLLECTION
- SOLR_QUERY
- DATE_FIELD
- RETURN_FIELDS
---
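The suggestion in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual GetSolr code: the class name `StateResetSketch`, the accessor `shouldClearState`, and the string-keyed `onPropertyModified` are assumptions made so the example runs without NiFi on the classpath. Only the property names are taken from the list in the comment.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the processor; not the real GetSolr class. */
final class StateResetSketch {

    // Properties whose change invalidates the stored query state
    // (names copied from the review comment).
    private static final Set<String> STATE_RELEVANT = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList(
                    "SOLR_TYPE", "SOLR_LOCATION", "COLLECTION",
                    "SOLR_QUERY", "DATE_FIELD", "RETURN_FIELDS")));

    private final AtomicBoolean clearState = new AtomicBoolean(false);

    /**
     * Same shape as onPropertyModified in the diff, but keyed by a plain
     * property name instead of a PropertyDescriptor (an assumption).
     */
    void onPropertyModified(String propertyName, String oldValue, String newValue) {
        boolean valueChanged = !Objects.equals(oldValue, newValue);
        if (valueChanged && STATE_RELEVANT.contains(propertyName)) {
            clearState.set(true);
        }
    }

    /** True once a state-relevant property has been modified. */
    boolean shouldClearState() {
        return clearState.get();
    }
}
```

With this shape, changing a tuning property such as the batch size leaves the flag untouched, while changing the query or the collection marks the state for clearing on the next scheduling.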
[GitHub] nifi pull request #2199: NIFI-3248: Improvement of GetSolr Processor
Github user ijokarumawak commented on a diff in the pull request:
https://github.com/apache/nifi/pull/2199#discussion_r144533800
--- Diff: nifi-nar-bundles/nifi-solr-bundle/nifi-solr-processors/src/main/java/org/apache/nifi/processors/solr/GetSolr.java ---
@@ -172,157 +203,196 @@ protected void init(final ProcessorInitializationContext context) {
 @Override
 public void onPropertyModified(PropertyDescriptor descriptor, String oldValue, String newValue) {
-lastEndDatedRef.set(UNINITIALIZED_LAST_END_DATE_VALUE);
+clearState.set(true);
 }
-@OnStopped
-public void onStopped() {
-writeLastEndDate();
-}
+@OnScheduled
+public void onScheduled2(final ProcessContext context) throws IOException {
--- End diff --
Please rename the method so that it describes what it does, such as `clearState`. The annotation already explains when it is called.
---
[jira] [Created] (NIFI-4485) It should be possible to boot the NiFi engine via the Java API
Peter Horvath created NIFI-4485:
---
Summary: It should be possible to boot the NiFi engine via the Java API
Key: NIFI-4485
URL: https://issues.apache.org/jira/browse/NIFI-4485
Project: Apache NiFi
Issue Type: Improvement
Reporter: Peter Horvath

Class {{org.apache.nifi.NiFi}} was not designed with extensibility or programmatic access in mind. This class is the entry point of the engine; however, the current implementation does not allow a potential caller (e.g. an integration test harness) to bootstrap the engine and then shut it down properly.

Please change this so that a NiFi instance can be started via the Java API. Introduce a separate class that allows the engine to be started in "embedded" mode. This should basically be an extension to the existing class {{org.apache.nifi.NiFi#NiFi}}, but with some enhancements: the constructor {{org.apache.nifi.NiFi#NiFi}} registers an {{UncaughtExceptionHandler}}, a JVM {{Shutdown Hook}} and changes logging framework settings. These should NOT happen in embedded mode. In addition, it should be possible to shut the engine down via the API.
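To make the request concrete, here is a rough sketch of the lifecycle an integration test harness might want. Everything in it is hypothetical: NiFi does not ship an `EmbeddedEngine` interface or a `FakeEmbeddedEngine` class, and the toy implementation only tracks a running flag instead of booting anything. It is meant to show the intended contract (start without JVM-wide side effects, failures surfacing as exceptions, explicit shutdown), not an actual implementation.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical lifecycle contract for an embedded engine; not a NiFi API. */
interface EmbeddedEngine extends AutoCloseable {
    /** Starts the engine; failures propagate to the caller instead of being logged. */
    void start() throws Exception;

    boolean isRunning();

    /** Shuts the engine down via the API rather than a JVM shutdown hook. */
    @Override
    void close();
}

/** Toy implementation standing in for the real bootstrap logic. */
final class FakeEmbeddedEngine implements EmbeddedEngine {
    private final AtomicBoolean running = new AtomicBoolean(false);

    @Override
    public void start() {
        // Embedded mode as described in the issue: no UncaughtExceptionHandler,
        // no JVM shutdown hook, no changes to logging framework settings.
        if (!running.compareAndSet(false, true)) {
            throw new IllegalStateException("engine already started");
        }
    }

    @Override
    public boolean isRunning() {
        return running.get();
    }

    @Override
    public void close() {
        running.set(false);
    }
}
```

Extending `AutoCloseable` is a design choice of this sketch: it lets a test wrap the engine in try-with-resources so that shutdown happens even when an assertion fails.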
[jira] [Updated] (NIFI-4424) org.apache.nifi.NiFi does not allow programmatic access to the NiFi engine
[ https://issues.apache.org/jira/browse/NIFI-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Horvath updated NIFI-4424: Description: Class {{org.apache.nifi.NiFi}} was not designed with extensibility or programmatic access in mind. This class is the entry point of the engine; however, the current implementation does not allow a potential caller (e.g. an integration test harness) to bootstrap the engine and then shut it down properly: The main method {{org.apache.nifi.NiFi#main}} simply logs any exception, which is fine when started from the command line but prevents programmatic usage and the detection of error conditions (exceptions), which is essential for accessing it programmatically from an integration test. The constructor {{org.apache.nifi.NiFi#NiFi}} registers an {{UncaughtExceptionHandler}}, a JVM {{Shutdown Hook}} and changes logging framework settings. *Please change this behaviour:* Expose *two* methods, one of which accepts the command line argument one would pass to the NiFi process and another one, which allows the NiFiProperties object to be passed. This method should return the {{NiFi}} object instance for further programmatic access. The logic used to register {{UncaughtExceptionHandler}}, a JVM Shutdown Hook and changing logging framework settings should be extracted to a {{protected}} *instance* method so that a client can override their behaviour with a NO-OP. A second class called e.g. {{org.apache.nifi.EmbeddedNiFi}} could be introduced as a base class for this use-case, where the engine is started through the Java API. *Please note these changes are baby-steps towards the implementation of a NiFi integration test harness.* was: Class {{org.apache.nifi.NiFi}} was not designed with extensibility or programmatic access in mind. This class is the entry point of the engine, however, the current implementation does not allow a potential caller (e.g.
an integration test harness) to bootstrap the engine and then shut it down properly: The main method {{org.apache.nifi.NiFi#main}} simply logs any exception, which is fine when started from the command line, however prevents programmatic usage and detecting error conditions (Exceptions) that would be essential to programatically access it from an integration test. The constructor {{org.apache.nifi.NiFi#NiFi}} registers an {{UncaughtExceptionHandler}}, a JVM {{Shutdown Hook}} and changes logging framework settings. *Please change this behaviour:* Expose *two* methods, one of which accepts the command line argument one would pass to the NiFi process and another one, which allows the NiFiProperties object to be passed. This method should return the {{NiFi}} object instance for further programmatic access. The logic used to register {{UncaughtExceptionHandler}}, a JVM Shutdown Hook and changing logging framework settings should be extracted to a {{protected}} *instance* method so that a client can override their behaviour with a NO-OP. *Please note these changes are baby-steps towards the implementation of a NiFi integration test harness.* > org.apache.nifi.NiFi does not allow programmatic access to the NiFi engine > -- > > Key: NIFI-4424 > URL: https://issues.apache.org/jira/browse/NIFI-4424 > Project: Apache NiFi > Issue Type: Improvement > Components: Core Framework >Affects Versions: 1.3.0 >Reporter: Peter Horvath > > Class {{org.apache.nifi.NiFi}} was not designed with extensibility or > programmatic access in mind. > This class is the entry point of the engine, however, the current > implementation does not allow > a potential caller (e.g. 
an integration test harness) to bootstrap the engine > and then shut it down properly: > The main method {{org.apache.nifi.NiFi#main}} simply logs any exception, > which is fine > when started from the command line, however prevents programmatic usage and > detecting error conditions (Exceptions) that would be essential to > programatically access > it from an integration test. > The constructor {{org.apache.nifi.NiFi#NiFi}} registers an > {{UncaughtExceptionHandler}}, > a JVM {{Shutdown Hook}} and changes logging framework settings. > *Please change this behaviour:* > Expose *two* methods, one of which accepts the command line argument one > would pass > to the NiFi process and another one, which allows the NiFiProperties object > to be passed. > This method should return the {{NiFi}} object instance for further > programmatic access. > The logic used to register {{UncaughtExceptionHandler}}, a JVM Shutdown Hook > and > changing logging framework settings should be extracted to a {{protected}} > *instance* > method so that a client can override their behaviour with a NO-OP. > A second class called e.g.