[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174902#comment-15174902 ]

Noble Paul edited comment on SOLR-8349 at 3/2/16 9:29 AM:
----------------------------------------------------------

Thanks. The tests are fine, but for one thing: the blob store (the {{.system}} collection) is not guaranteed to be available at core load time. So your component should implement {{BlobStoreAware}}, and the class should try to load resources only in the callback for that interface.

The BlobStoreAware interface is not yet implemented (SOLR-8772).

was (Author: noble.paul):
Thanks. The tests are fine, but for one thing: the blob store (the {{.system}} collection) is not guaranteed to be available at core load time. So your component should implement {{BlobStoreAware}}, and the class should try to load resources only in the callback for that interface.

> Allow sharing of large in memory data structures across cores
> -------------------------------------------------------------
>
>                 Key: SOLR-8349
>                 URL: https://issues.apache.org/jira/browse/SOLR-8349
>             Project: Solr
>          Issue Type: Improvement
>          Components: Server
>    Affects Versions: 5.3
>            Reporter: Gus Heck
>            Assignee: Noble Paul
>         Attachments: SOLR-8349.patch, SOLR-8349.patch, SOLR-8349.patch,
> SOLR-8349.patch, SOLR-8349.patch, SOLR-8349.patch, SOLR_8349.patch
>
>
> In some cases search components or analysis classes may utilize a large
> dictionary or other in-memory structure. When multiple cores are loaded with
> identical configurations utilizing this large in-memory structure, each core
> holds its own copy in memory. This has been noted in the past and a specific
> case reported in SOLR-3443. This patch provides a generalized capability, and
> if accepted, this capability will then be used to fix SOLR-3443.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
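The 3/2/16 comment asks components to load blob-backed resources only once the blob store is known to be available. Since the {{BlobStoreAware}} interface did not exist at the time of that comment, the following is only a rough sketch of the idea: the interface shape, the {{blobStoreAvailable}} callback name, the fetcher function and the {{DictionaryComponent}} class are all assumptions made for illustration, not Solr API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Hypothetical stand-in for the proposed interface; the real shape was
// not defined when the comment was written.
interface BlobStoreAware {
    // Assumed callback: invoked by the container once the blob store is reachable.
    void blobStoreAvailable(Function<String, ByteBuffer> blobFetcher);
}

// Hypothetical component that defers resource loading to the callback.
class DictionaryComponent implements BlobStoreAware {
    private final String blobName;
    private volatile String dictionary; // stays null until the callback has run

    DictionaryComponent(String blobName) {
        this.blobName = blobName;
    }

    // Core load time: record configuration only, do not touch the blob store.
    void init() {
    }

    @Override
    public void blobStoreAvailable(Function<String, ByteBuffer> blobFetcher) {
        // Only here is it safe to fetch and decode the blob.
        ByteBuffer raw = blobFetcher.apply(blobName);
        dictionary = StandardCharsets.UTF_8.decode(raw).toString();
    }

    boolean isReady() {
        return dictionary != null;
    }

    String dictionary() {
        return dictionary;
    }
}
```

In this sketch a request arriving before the callback would see {{isReady()}} return false; how a real component should behave in that window is left open.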
[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166522#comment-15166522 ]

Noble Paul edited comment on SOLR-8349 at 2/25/16 1:37 AM:
-----------------------------------------------------------

One problem I see with the patch is decoding the bytebuffer in two different ways. What if core1 has decoder1 and core2 has decoder2? Then the second call gets the output of the first decoder. That's why I kept a map internally, so that it is possible to deal with that use case. It may be unusual to do so, but for the sake of correctness we have to do it.

was (Author: noble.paul):
One problem I see with the patch is decoding the object in two different ways. What if core1 has decoder1 and core2 has decoder2? Then the second call gets the output of the first decoder. That's why I kept a map internally, so that it is possible to deal with that use case. It may be unusual to do so, but for the sake of correctness we have to do it.
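The decoder concern in the 2/25/16 comment can be illustrated with a small cache that keys decoded objects by both blob name and decoder, so that a second core with a different decoder never receives the first decoder's output. This is a sketch under stated assumptions: {{DecodedBlobCache}}, its {{Decoder}} interface and the string key format are invented here and are not the map used in the actual patch.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: decoded objects are cached per (blob name, decoder name).
class DecodedBlobCache {
    // Hypothetical decoder abstraction; name() identifies the decoding scheme.
    interface Decoder<T> {
        String name();

        T decode(ByteBuffer raw); // must not return null
    }

    private final Function<String, ByteBuffer> rawSource;
    private final Map<String, Object> decoded = new ConcurrentHashMap<>();

    DecodedBlobCache(Function<String, ByteBuffer> rawSource) {
        this.rawSource = rawSource;
    }

    @SuppressWarnings("unchecked")
    <T> T get(String blobName, Decoder<T> decoder) {
        // Simplification: assumes '#' does not occur in blob or decoder names.
        String key = blobName + "#" + decoder.name();
        // A read-only view gives each decoder its own position in the raw bytes.
        return (T) decoded.computeIfAbsent(key,
                k -> decoder.decode(rawSource.apply(blobName).asReadOnlyBuffer()));
    }
}
```

Two cores that pass the same decoder share one decoded instance, while a core with a different decoder gets its own entry derived from the same raw bytes.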
[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156323#comment-15156323 ]

Noble Paul edited comment on SOLR-8349 at 2/22/16 6:21 AM:
-----------------------------------------------------------

bq. So are you proposing changing JarRepository, or adding a similar class called BlobRepository?
No, I would rename it to BlobRepository.

bq. Seems like fairly major surgery is required to make the JarRepository class fully generic.
I shall put up a patch which can do this.

bq. I need to better understand the lazy="true" bit you mentioned,
I understand the problem with {{startup=lazy}}. We probably should make a new interface called {{BlobStoreAware}} which loads the component when the BlobStore is available. But let's keep it separate.

was (Author: noble.paul):
bq. So are you proposing changing JarRepository, or adding a similar class called BlobRepository?
No, I would rename it to BlobRepository.

bq. Seems like fairly major surgery is required to make the JarRepository class fully generic.
I shall put up a patch which can do this.

bq. I need to better understand the lazy="true" bit you mentioned,
I understand the problem with {{startup=lazy}}. We probably should make a new interface called {{BlobStoreAware}} which loads the component when the BlobStore is available. But let's not keep it separate.
[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156319#comment-15156319 ]

Gus Heck edited comment on SOLR-8349 at 2/22/16 12:30 AM:
----------------------------------------------------------

Hmm, looks like I didn't realize my IDE had navigated me to another class; getByteBuffer is on MemClassLoader...

Ah, I see. No, I confused the method name when I wrote the comment. I meant to say that getFileContent (on JarContent, or its equivalent) couldn't return a ByteBuffer.

was (Author: gus_heck):
Hmm, looks like I didn't realize my IDE had navigated me to another class; getByteBuffer is on MemClassLoader...
[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156295#comment-15156295 ]

Noble Paul edited comment on SOLR-8349 at 2/21/16 11:38 PM:
------------------------------------------------------------

bq. That may come in handy if/when a refresh mechanism is desired.
The refresh mechanism is built into the system. The blob reference is a combination of name and version, for example {{myCsvFile/1}} (where '1' is the first version). Every component can be updated using the API. If you add a new version of the blob, you simply update your component using the config API to use {{myCsvFile/2}}. The rest is taken care of automatically.

was (Author: noble.paul):
bq. That may come in handy if/when a refresh mechanism is desired.
The refresh mechanism is built into the system. The blob reference is a combination of name and version, for example {{myCsvFile/1}} (where '1' is the first version). Every component can be updated using the API. If you add a new version of the blob, you simply update your component to use {{myCsvFile/2}}. The rest is taken care of automatically.
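The name/version convention from the 2/21/16 11:38 PM comment ({{myCsvFile/1}}, then {{myCsvFile/2}}) can be made concrete with a small parser. This is only an illustration of the naming scheme: {{BlobRef}} is a hypothetical helper, not a Solr class, and a real deployment would change the reference through the config API rather than compute it like this.

```java
// Hypothetical helper that splits a "name/version" blob reference.
class BlobRef {
    final String name;
    final int version;

    BlobRef(String name, int version) {
        this.name = name;
        this.version = version;
    }

    // Parses references such as "myCsvFile/1"; the version follows the last slash.
    static BlobRef parse(String ref) {
        int slash = ref.lastIndexOf('/');
        if (slash <= 0 || slash == ref.length() - 1) {
            throw new IllegalArgumentException("expected name/version, got: " + ref);
        }
        return new BlobRef(ref.substring(0, slash),
                Integer.parseInt(ref.substring(slash + 1)));
    }

    // Reference to the next version of the same blob.
    BlobRef next() {
        return new BlobRef(name, version + 1);
    }

    @Override
    public String toString() {
        return name + "/" + version;
    }
}
```

For instance, {{BlobRef.parse("myCsvFile/1").next()}} yields a reference that prints as {{myCsvFile/2}}, the value a component would be reconfigured to use after a new blob version is uploaded.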
[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156293#comment-15156293 ]

Noble Paul edited comment on SOLR-8349 at 2/21/16 11:33 PM:
------------------------------------------------------------

OK, I got the point. The framework can easily be extended to make the decoder pluggable, so the cache can keep the decoded object in memory instead of a ByteBuffer.

So the API would look like:
{code}
MyCsvDecoder csvDecoder = null; // initiate your decoder that would convert your csv to MyCustomObject
ObjectRef ref = BlobRepository.getObjectIncRef(name, csvDecoder);
{code}

was (Author: noble.paul):
OK, I got the point. The framework can easily be extended to make the codec pluggable, so the cache can keep the decoded object in memory instead of a ByteBuffer.

So the API would look like:
{code}
MyCsvDecoder csvDecoder = null; // initiate your decoder that would convert your csv to MyCustomObject
ObjectRef ref = BlobRepository.getObjectIncRef(name, csvDecoder);
{code}
[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156236#comment-15156236 ]

Noble Paul edited comment on SOLR-8349 at 2/21/16 9:26 PM:
-----------------------------------------------------------

[~gus_heck] There is another easy, existing solution for this problem.

How to do this:
1) Store your large file in the blob store
2) Use {{blobRef = JarRepository.getJarIncRef(name)}} to get the content (we will change the method names so that they make sense for you)
3) Make your component register as a close hook
4) In {{postClose()}}, do a {{blobRef.decrementJarRefCount()}}

The advantages of this solution are:
1) You get a version-managed store for your large files without screwing up your ZK
2) There are APIs to manage the blob
3) It is already refcounted, etc.

The caveats are:
1) It does not work for standalone. We can extend it to do that
2) You will have to make your component {{startup=lazy}}

was (Author: noble.paul):
[~gus_heck] There is another easy, existing solution for this problem.

How to do this:
1) Store your large file in the blob store
2) Use {{blobRef = JarRepository.getJarIncRef(name)}} to get the content (we will change the method names so that they make sense for you)
3) Make your component register as a close hook
4) In {{postClose()}}, do a {{blobRef.decrementJarRefCount()}}

The advantages of this solution are:
1) You get a version-managed store for your large files without screwing up your ZK
2) It is already refcounted, etc.

The caveats are:
1) It does not work for standalone. We can extend it to do that
2) You will have to make your component {{startup=lazy}}
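The four steps in the 2/21/16 9:26 PM comment (fetch with a reference-count increment, register a close hook, decrement on close) follow a general pattern that can be sketched independently of Solr. Everything below is hypothetical: {{RefCountedRepository}} and {{FakeCore}} are stand-ins written for this example and do not reproduce the {{JarRepository}} or {{SolrCore}} APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical ref-counted repository; not the Solr JarRepository API.
class RefCountedRepository<T> {
    private static final class Entry<V> {
        final V value;
        int refs;

        Entry(V value) {
            this.value = value;
        }
    }

    private final Map<String, Entry<T>> entries = new HashMap<>();
    private final Function<String, T> loader;

    RefCountedRepository(Function<String, T> loader) {
        this.loader = loader;
    }

    // Loads the resource on first use and increments its reference count.
    synchronized T getIncRef(String name) {
        Entry<T> e = entries.get(name);
        if (e == null) {
            e = new Entry<>(loader.apply(name));
            entries.put(name, e);
        }
        e.refs++;
        return e.value;
    }

    // Decrements the count and drops the resource once no user remains.
    synchronized void decRef(String name) {
        Entry<T> e = entries.get(name);
        if (e != null && --e.refs == 0) {
            entries.remove(name);
        }
    }

    synchronized boolean isLoaded(String name) {
        return entries.containsKey(name);
    }
}

// Hypothetical stand-in for a core that runs registered hooks when it closes.
class FakeCore {
    private final List<Runnable> closeHooks = new ArrayList<>();

    void addCloseHook(Runnable hook) {
        closeHooks.add(hook);
    }

    void close() {
        closeHooks.forEach(Runnable::run);
    }
}
```

A component would call {{getIncRef}} when it initializes and register {{() -> repo.decRef(name)}} as the close hook, so the shared resource is released only after the last core using it has closed.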
[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148797#comment-15148797 ]

Gus Heck edited comment on SOLR-8349 at 2/16/16 3:28 PM:
---------------------------------------------------------

Executive summary of the last overly long comment:
# Guava's cache is indeed a cool cache.
# It doesn't support arbitrary loaders in a way that is consistent with my design.
# We can do one of these things (AFAICT):
## Use the working code I wrote (no Guava cache).
## Change our goals and use Guava (allow cores loading the same resource to all block each other until loading is done).
## Use Guava and wrap it in additional loader management code of similar complexity to my original code.
# Weak/soft values require someone to hold the strong reference.

New thought this morning: I could probably add methods and a list to the SolrCore object for the purpose of giving it a reference to the resource at load time, thus tying the life cycle of the resource to the object we want it to live and die with. Then weak values would probably work fine.

was (Author: gus_heck):
Executive summary of the last overly long comment:
# Guava's cache is indeed a cool cache.
# It doesn't support arbitrary loaders in a way that is consistent with my design.
# Either we can do one of these things (AFAICT):
## Use the working code I wrote (no Guava cache).
## Change our goals and use Guava (allow cores loading the same resource to all block each other until loading is done).
## Use Guava and wrap it in additional loader management code of similar complexity to my original code.
# Weak/soft values require someone to hold the strong reference.

New thought this morning: I could probably add methods and a list to the SolrCore object for the purpose of giving it a reference to the resource at load time, thus tying the life cycle of the resource to the object we want it to live and die with. Then weak values would probably work fine.
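The "new thought" in the 2/16/16 comment, that weak values work if each core holds a strong reference for as long as it lives, might look roughly like the sketch below. It uses plain {{java.lang.ref.WeakReference}} rather than Guava, and {{WeakResourceCache}} and {{CoreStandIn}} are hypothetical names. Note one simplification that departs from the goals stated in the comment: the single lock here serializes all loads, so cores loading different resources would block each other.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical container-level cache whose values are only weakly held.
class WeakResourceCache {
    private final Map<String, WeakReference<Object>> cache = new HashMap<>();

    // Returns the live shared instance, or loads a new one if none survives.
    synchronized Object get(String key, Supplier<Object> loader) {
        WeakReference<Object> ref = cache.get(key);
        Object value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.get();
            cache.put(key, new WeakReference<>(value));
        }
        return value;
    }
}

// Hypothetical stand-in for a core that pins the resources it uses.
class CoreStandIn {
    // Strong references: they keep shared resources alive while this core lives.
    private final List<Object> heldResources = new ArrayList<>();

    Object attach(WeakResourceCache cache, String key, Supplier<Object> loader) {
        Object resource = cache.get(key, loader);
        heldResources.add(resource);
        return resource;
    }

    // Dropping the strong references lets the garbage collector reclaim
    // a resource once no other core still holds it.
    void close() {
        heldResources.clear();
    }
}
```

When the last core holding a resource closes, nothing strongly references it any longer, so it becomes eligible for collection without explicit reference counting; exactly when it is reclaimed is up to the garbage collector.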
[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142799#comment-15142799 ]

Gus Heck edited comment on SOLR-8349 at 2/11/16 2:32 PM:
---------------------------------------------------------

I am not sure I understand what you mean by loading the analysis component into the container cache. The analysis component in Lucene needs some way to reference the container and ask for the shared resource. My assumption was that it was not OK to add dependencies on Solr classes to Lucene. Otherwise I could just cast to SolrResourceLoader in the analysis component in question. Did you look at SOLR-3443, where I use this patch to implement the feature for HunspellStemFilterFactory.java? There I do this:
{code}
if (loader instanceof ContainerResourceSharing) {
  resourceSharing = (ContainerResourceSharing) loader;
}
{code}
This then allows me to have a method like:
{code}
public Dictionary getDictionary() {
  return this.dictionary == null
      ? (Dictionary) resourceSharing.getContainerResource(resourceKey)
      : this.dictionary;
}
{code}
If I can use a Solr class, that could just as easily say:
{code}
if (loader instanceof SolrResourceLoader) {
  resourceSharing = (SolrResourceLoader) loader;
}
{code}
One could also add methods to ResourceLoader itself, but then all forms of resource loader would have to deal with implementing this method.

was (Author: gus_heck):
I am not sure I understand what you mean by loading the analysis component into the container cache. The analysis component in Lucene needs some way to reference the container and ask for the shared resource. My assumption was that it was not OK to add dependencies to solr dependencies into Lucene. Otherwise I could just cast to SolrResourceLoader in the analysis component in question. Did you look at SOLR-3443, where I use this patch to implement the feature for HunspellStemFilterFactory.java? There I do this:
{code}
if (loader instanceof ContainerResourceSharing) {
  resourceSharing = (ContainerResourceSharing) loader;
}
{code}
This then allows me to have a method like:
{code}
public Dictionary getDictionary() {
  return this.dictionary == null
      ? (Dictionary) resourceSharing.getContainerResource(resourceKey)
      : this.dictionary;
}
{code}
If I can use a Solr class, that could just as easily say:
{code}
if (loader instanceof SolrResourceLoader) {
  resourceSharing = (SolrResourceLoader) loader;
}
{code}
One could also add methods to ResourceLoader itself, but then all forms of resource loader would have to deal with implementing this method.
[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores
[ https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15031082#comment-15031082 ]

Gus Heck edited comment on SOLR-8349 at 12/2/15 7:53 PM:
---------------------------------------------------------

Patch implementing general feature (on 6.x). Patch for SOLR-3443 to be attached there shortly.

was (Author: gus_heck):
Patch implementing general feature. Patch for SOLR-3443 to be attached there shortly.