[jira] [Commented] (NIFI-2681) Avoid caching Provenance Index Searchers

2016-09-06 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467818#comment-15467818
 ] 

ASF subversion and git services commented on NIFI-2681:
---

Commit 088125451b7f15684752a914a3a83a834fac1a50 in nifi's branch 
refs/heads/master from [~markap14]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=0881254 ]

NIFI-2681: Refactored IndexManager into an interface and renamed the existing 
implementation to CachingIndexManager. Implemented a new SimpleIndexManager 
that performs no caching of IndexSearchers.

This closes #958.

Signed-off-by: Bryan Bende 


> Avoid caching Provenance Index Searchers
> 
>
> Key: NIFI-2681
> URL: https://issues.apache.org/jira/browse/NIFI-2681
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 1.1.0
>
>
> In NIFI-2600 and NIFI-2452, we addressed two bugs where the Provenance 
> Repository closes a cached IndexSearcher too soon. The IndexManager keeps the 
> searchers cached in an effort to offer better performance when performing a 
> Provenance Query. This was done because it was recommended in the Lucene 
> documentation. However, we occasionally still see nodes crashing with 
> segfaults due to the Lucene Searching. We should update the Persistent 
> Provenance Repository to stop caching Index Searchers in order to trade a 
> slight performance improvement for significantly better reliability.
> Playing around with the idea in order to test it out shows very favorable 
> results. On a system where I could cause a seg fault almost every time that I 
> ran a large provenance query, I updated the code to no longer cache the 
> readers and saw perfect stability with no noticeable performance degradation.
> I will cleanup the code and submit a PR for these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2681) Avoid caching Provenance Index Searchers

2016-09-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467820#comment-15467820
 ] 

ASF GitHub Bot commented on NIFI-2681:
--

Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/958


> Avoid caching Provenance Index Searchers
> 
>
> Key: NIFI-2681
> URL: https://issues.apache.org/jira/browse/NIFI-2681
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 1.1.0
>
>
> In NIFI-2600 and NIFI-2452, we addressed two bugs where the Provenance 
> Repository closes a cached IndexSearcher too soon. The IndexManager keeps the 
> searchers cached in an effort to offer better performance when performing a 
> Provenance Query. This was done because it was recommended in the Lucene 
> documentation. However, we occasionally still see nodes crashing with 
> segfaults due to the Lucene Searching. We should update the Persistent 
> Provenance Repository to stop caching Index Searchers in order to trade a 
> slight performance improvement for significantly better reliability.
> Playing around with the idea in order to test it out shows very favorable 
> results. On a system where I could cause a seg fault almost every time that I 
> ran a large provenance query, I updated the code to no longer cache the 
> readers and saw perfect stability with no noticeable performance degradation.
> I will cleanup the code and submit a PR for these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NIFI-2681) Avoid caching Provenance Index Searchers

2016-09-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467763#comment-15467763
 ] 

ASF GitHub Bot commented on NIFI-2681:
--

Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/958#discussion_r77664996
  
--- Diff: 
nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/lucene/CachingIndexManager.java
 ---
@@ -0,0 +1,535 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.provenance.lucene;
+
+import java.io.Closeable;
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReentrantLock;
+
+import org.apache.lucene.analysis.Analyzer;
+import org.apache.lucene.analysis.standard.StandardAnalyzer;
+import org.apache.lucene.index.DirectoryReader;
+import org.apache.lucene.index.IndexWriter;
+import org.apache.lucene.index.IndexWriterConfig;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.store.Directory;
+import org.apache.lucene.store.FSDirectory;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class CachingIndexManager implements Closeable, IndexManager {
+private static final Logger logger = 
LoggerFactory.getLogger(CachingIndexManager.class);
+
+private final Lock lock = new ReentrantLock();
+private final Map writerCounts = new 
HashMap<>();
+private final Map activeSearchers = 
new HashMap<>();
+
+
+public void removeIndex(final File indexDirectory) {
+final File absoluteFile = indexDirectory.getAbsoluteFile();
+logger.info("Removing index {}", indexDirectory);
+
+lock.lock();
+try {
+final IndexWriterCount count = 
writerCounts.remove(absoluteFile);
+if ( count != null ) {
+try {
+count.close();
+} catch (final IOException ioe) {
+logger.warn("Failed to close Index Writer {} for {}", 
count.getWriter(), absoluteFile);
+if ( logger.isDebugEnabled() ) {
+logger.warn("", ioe);
+}
+}
+}
+
+for ( final List searcherList : 
activeSearchers.values() ) {
--- End diff --

@bbende wow - great catch. That definitely should have only closed the 
index searchers for the abosluteFile. Addressed and pushed.


> Avoid caching Provenance Index Searchers
> 
>
> Key: NIFI-2681
> URL: https://issues.apache.org/jira/browse/NIFI-2681
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 1.1.0
>
>
> In NIFI-2600 and NIFI-2452, we addressed two bugs where the Provenance 
> Repository closes a cached IndexSearcher too soon. The IndexManager keeps the 
> searchers cached in an effort to offer better performance when performing a 
> Provenance Query. This was done because it was recommended in the Lucene 
> documentation. However, we occasionally still see nodes crashing with 
> segfaults due to the Lucene Searching. We should update the Persistent 
> Provenance Repository to stop caching Index Searchers in order to trade a 
> slight performance improvement for significantly better reliability.
> Playing around with the idea in order to test it out shows very favorable 
> results. On a system where I could cause a seg 

[jira] [Commented] (NIFI-2681) Avoid caching Provenance Index Searchers

2016-08-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NIFI-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15442158#comment-15442158
 ] 

ASF GitHub Bot commented on NIFI-2681:
--

GitHub user markap14 opened a pull request:

https://github.com/apache/nifi/pull/958

NIFI-2681: Refactored IndexManager into an interface and renamed the …

…existing implementation to CachingIndexManager. Implemented a new 
SimpleIndexManager that performs no caching of IndexSearchers.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markap14/nifi NIFI-2681

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/958.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #958


commit 58c715cefaeaa9ef2b7d89cd751456da73f42953
Author: Mark Payne 
Date:   2016-08-27T00:03:16Z

NIFI-2681: Refactored IndexManager into an interface and renamed the 
existing implementation to CachingIndexManager. Implemented a new 
SimpleIndexManager that performs no caching of IndexSearchers.




> Avoid caching Provenance Index Searchers
> 
>
> Key: NIFI-2681
> URL: https://issues.apache.org/jira/browse/NIFI-2681
> Project: Apache NiFi
>  Issue Type: Bug
>  Components: Core Framework
>Reporter: Mark Payne
>Assignee: Mark Payne
>Priority: Critical
> Fix For: 1.1.0
>
>
> In NIFI-2600 and NIFI-2452, we addressed two bugs where the Provenance 
> Repository closes a cached IndexSearcher too soon. The IndexManager keeps the 
> searchers cached in an effort to offer better performance when performing a 
> Provenance Query. This was done because it was recommended in the Lucene 
> documentation. However, we occasionally still see nodes crashing with 
> segfaults due to the Lucene Searching. We should update the Persistent 
> Provenance Repository to stop caching Index Searchers in order to trade a 
> slight performance improvement for significantly better reliability.
> Playing around with the idea in order to test it out shows very favorable 
> results. On a system where I could cause a seg fault almost every time that I 
> ran a large provenance query, I updated the code to no longer cache the 
> readers and saw perfect stability with no noticeable performance degradation.
> I will cleanup the code and submit a PR for these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)