[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores

2016-03-02 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174902#comment-15174902
 ] 

Noble Paul edited comment on SOLR-8349 at 3/2/16 9:29 AM:
--

Thanks. The tests are fine, except for one thing.

The blob store (the {{.system}} collection) is not guaranteed to be available 
at core load time. So your component should implement {{BlobStoreAware}} and 
try to load resources only in the callback for that interface. The 
{{BlobStoreAware}} interface is not yet implemented; see SOLR-8772.
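
To illustrate the callback approach, here is a minimal sketch. Since 
{{BlobStoreAware}} is not implemented yet, the interface shape, the 
{{onBlobStoreAvailable}} method name, and every helper type below are 
assumptions made for illustration, not Solr API:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface; the real one is only proposed in SOLR-8772
// and its final shape may differ.
interface BlobStoreAwareSketch {
    void onBlobStoreAvailable(BlobStoreSketch store);
}

// Hypothetical stand-in for read access to the .system collection.
interface BlobStoreSketch {
    ByteBuffer fetch(String blobKey);
}

// A component that does nothing expensive at construction (core load) time.
class DictionaryComponent implements BlobStoreAwareSketch {
    private final String blobKey;
    private volatile ByteBuffer dictionary; // stays null until the callback fires

    DictionaryComponent(String blobKey) {
        this.blobKey = blobKey;
    }

    @Override
    public void onBlobStoreAvailable(BlobStoreSketch store) {
        // Resources are loaded only here, once the blob store is reachable.
        dictionary = store.fetch(blobKey);
    }

    boolean isReady() {
        return dictionary != null;
    }
}

// Hypothetical container-side helper: collects listeners during core load and
// notifies them once the blob store becomes available.
class BlobStoreNotifier {
    private final List<BlobStoreAwareSketch> listeners = new ArrayList<>();

    void register(BlobStoreAwareSketch listener) {
        listeners.add(listener);
    }

    void blobStoreReady(BlobStoreSketch store) {
        for (BlobStoreAwareSketch listener : listeners) {
            listener.onBlobStoreAvailable(store);
        }
    }
}
```

The point of the design is that the component's constructor never touches the 
blob store, so a core can finish loading even when {{.system}} is not up yet.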


was (Author: noble.paul):
Thanks. The tests are fine, except for one thing.

The blob store (the {{.system}} collection) is not guaranteed to be available 
at core load time. So your component should implement {{BlobStoreAware}} and 
try to load resources only in the callback for that interface.

> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 5.3
>Reporter: Gus Heck
>Assignee: Noble Paul
> Attachments: SOLR-8349.patch, SOLR-8349.patch, SOLR-8349.patch, 
> SOLR-8349.patch, SOLR-8349.patch, SOLR-8349.patch, SOLR_8349.patch
>
>
> In some cases search components or analysis classes may utilize a large 
> dictionary or other in-memory structure. When multiple cores are loaded with 
> identical configurations utilizing this large in-memory structure, each core 
> holds its own copy in memory. This has been noted in the past and a specific 
> case reported in SOLR-3443. This patch provides a generalized capability, and 
> if accepted, this capability will then be used to fix SOLR-3443.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores

2016-02-24 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166522#comment-15166522
 ] 

Noble Paul edited comment on SOLR-8349 at 2/25/16 1:37 AM:
---

One problem I see with the patch is decoding the ByteBuffer in two different 
ways. What if core1 has decoder1 and core2 has decoder2? Then the second call 
gets the output of the first decoder. That's why I kept a map internally, so 
that it is possible to deal with that use case. It may be unusual to do so, 
but for the sake of correctness we have to handle it.
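
A minimal sketch of the idea, not the code from the patch: decoded objects 
are cached per blob key and per decoder, so a second decoder never receives 
the first decoder's output. The class name {{DecodedBlobCache}}, its methods, 
and the use of a decoder name as the inner map key are assumptions for 
illustration:

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache: one raw blob per key, one decoded object per
// (blob key, decoder name) pair.
class DecodedBlobCache {
    private final Map<String, ByteBuffer> rawBlobs = new ConcurrentHashMap<>();
    private final Map<String, Map<String, Object>> decoded = new ConcurrentHashMap<>();

    void putRaw(String blobKey, ByteBuffer content) {
        rawBlobs.put(blobKey, content);
    }

    Object get(String blobKey, String decoderName, Function<ByteBuffer, ?> decoder) {
        ByteBuffer raw = rawBlobs.get(blobKey);
        if (raw == null) {
            throw new IllegalArgumentException("unknown blob: " + blobKey);
        }
        // Each decoder gets its own read-only view, so reading the bytes in
        // one decoder does not move the position seen by another.
        return decoded
                .computeIfAbsent(blobKey, k -> new ConcurrentHashMap<>())
                .computeIfAbsent(decoderName, k -> decoder.apply(raw.asReadOnlyBuffer()));
    }
}
```

With a single decoded slot per blob, core2 would silently get core1's object; 
the inner map is what keeps the two results apart.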



was (Author: noble.paul):
One problem I see with the patch is decoding the object in two different 
ways. What if core1 has decoder1 and core2 has decoder2? Then the second call 
gets the output of the first decoder. That's why I kept a map internally, so 
that it is possible to deal with that use case. It may be unusual to do so, 
but for the sake of correctness we have to handle it.


> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 5.3
>Reporter: Gus Heck
>Assignee: Noble Paul
> Attachments: SOLR-8349.patch, SOLR-8349.patch, SOLR-8349.patch, 
> SOLR-8349.patch



[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores

2016-02-21 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156323#comment-15156323
 ] 

Noble Paul edited comment on SOLR-8349 at 2/22/16 6:21 AM:
---

bq. So are you proposing changing JarRepository, or adding a similar class 
called BlobRepository?

No, I would rename it to BlobRepository.

bq. Seems like fairly major surgery is required to make the JarRepository class 
fully generic.

I shall put up a patch that does this.

bq. I need to better understand the lazy="true" bit you mentioned,

I understand the problem with {{startup=lazy}}. We should probably make a new 
interface called {{BlobStoreAware}}, which loads the component when the 
BlobStore is available. But let's keep that separate.


was (Author: noble.paul):
bq. So are you proposing changing JarRepository, or adding a similar class 
called BlobRepository?

No, I would rename it to BlobRepository.

bq. Seems like fairly major surgery is required to make the JarRepository class 
fully generic.

I shall put up a patch that does this.

bq. I need to better understand the lazy="true" bit you mentioned,

I understand the problem with {{startup=lazy}}. We should probably make a new 
interface called {{BlobStoreAware}}, which loads the component when the 
BlobStore is available. But let's not keep that separate.

> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 5.3
>Reporter: Gus Heck
>Assignee: Noble Paul
> Attachments: SOLR-8349.patch, SOLR-8349.patch, SOLR-8349.patch



[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores

2016-02-21 Thread Gus Heck (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156319#comment-15156319
 ] 

Gus Heck edited comment on SOLR-8349 at 2/22/16 12:30 AM:
--

Hmm, looks like I didn't realize my IDE had navigated me to another class; 
getByteBuffer is on MemClassLoader... Ah, I see: I confused the method name 
when I wrote the comment. I meant to say that getFileContent (on JarContent, 
or its equivalent) couldn't return a ByteBuffer.


was (Author: gus_heck):
Hmm, looks like I didn't realize my IDE had navigated me to another class; 
getByteBuffer is on MemClassLoader...

> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 5.3
>Reporter: Gus Heck
>Assignee: Noble Paul
> Attachments: SOLR-8349.patch, SOLR-8349.patch



[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores

2016-02-21 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156295#comment-15156295
 ] 

Noble Paul edited comment on SOLR-8349 at 2/21/16 11:38 PM:


bq. That may come in handy if/when a refresh mechanism is desired.

The refresh mechanism is built into the system. The full name is a 
combination of name and version, for example {{myCsvFile/1}} (where '1' is 
the first version). Every component can be updated using the API. If you add 
a new version of the blob, simply update your component using the config API 
to use {{myCsvFile/2}}. The rest is taken care of automatically.
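
As a small illustration of the {{name/version}} convention, the sketch below 
splits such a key and derives the key for the following version. The class 
{{VersionedBlobName}} is hypothetical and not part of Solr, and it assumes 
that versions are integers that increase by one with each upload:

```java
// Hypothetical helper for illustration only; not part of Solr.
final class VersionedBlobName {
    final String name;
    final int version;

    private VersionedBlobName(String name, int version) {
        this.name = name;
        this.version = version;
    }

    // Splits a key such as "myCsvFile/1" at the last '/' into name and version.
    static VersionedBlobName parse(String key) {
        int slash = key.lastIndexOf('/');
        if (slash <= 0 || slash == key.length() - 1) {
            throw new IllegalArgumentException("expected name/version: " + key);
        }
        return new VersionedBlobName(
                key.substring(0, slash),
                Integer.parseInt(key.substring(slash + 1)));
    }

    // The key a component would be reconfigured to use after a new upload
    // (assumes versions increase by one).
    VersionedBlobName next() {
        return new VersionedBlobName(name, version + 1);
    }

    @Override
    public String toString() {
        return name + "/" + version;
    }
}
```

Because the version is part of the key, a component pointed at 
{{myCsvFile/1}} keeps seeing the same content until its configuration is 
changed to reference the newer version.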


was (Author: noble.paul):
bq. That may come in handy if/when a refresh mechanism is desired.

The refresh mechanism is built into the system. The full name is a 
combination of name and version, for example {{myCsvFile/1}} (where '1' is 
the first version). Every component can be updated using the API. If you add 
a new version of the blob, simply update your component to use 
{{myCsvFile/2}}. The rest is taken care of automatically.

> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 5.3
>Reporter: Gus Heck
>Assignee: Noble Paul
> Attachments: SOLR-8349.patch, SOLR-8349.patch



[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores

2016-02-21 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156293#comment-15156293
 ] 

Noble Paul edited comment on SOLR-8349 at 2/21/16 11:33 PM:


OK, I got the point. The framework can easily be extended to make the decoder 
pluggable, so the cache can keep the decoded object in memory instead of the 
ByteBuffer.

So the API would look like this:
{code}
// Initialize your decoder, which converts your CSV to MyCustomObject
MyCsvDecoder csvDecoder = null;
ObjectRef ref = BlobRepository.getObjectIncRef(name, csvDecoder);
{code}


was (Author: noble.paul):
OK, I got the point. The framework can easily be extended to make the codec 
pluggable, so the cache can keep the decoded object in memory instead of the 
ByteBuffer.

So the API would look like this:
{code}
// Initialize your decoder, which converts your CSV to MyCustomObject
MyCsvDecoder csvDecoder = null;
ObjectRef ref = BlobRepository.getObjectIncRef(name, csvDecoder);
{code}

> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 5.3
>Reporter: Gus Heck
> Attachments: SOLR-8349.patch, SOLR-8349.patch



[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores

2016-02-21 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156236#comment-15156236
 ] 

Noble Paul edited comment on SOLR-8349 at 2/21/16 9:26 PM:
---

[~gus_heck] There is another easy, existing solution for this problem.

How to do this:

1) Store your large file in the blob store
2) Use {{blobRef = JarRepository.getJarIncRef(name)}} to get the content (we 
will change the method names so that they make sense for this use)
3) Make your component register as a close hook
4) In {{postClose()}}, call {{blobRef.decrementJarRefCount()}}


The advantages of this solution are:

1) You get a version-managed store for your large files without screwing up 
your ZK
2) There are APIs to manage the blob
3) It is already reference counted, etc.

The caveats are:

1) It does not work in standalone mode. We can extend it to do that
2) You will have to make your component {{startup=lazy}}
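
The four steps above can be sketched in a self-contained form. This is not 
the {{JarRepository}} code; {{RefCountedBlobs}}, {{BlobBackedComponent}}, and 
the fetcher function are hypothetical stand-ins that only show how reference 
counting and a {{postClose()}}-style release fit together:

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a reference-counted blob repository.
class RefCountedBlobs {
    static final class Ref {
        final String key;
        final ByteBuffer content;
        int refCount;

        Ref(String key, ByteBuffer content) {
            this.key = key;
            this.content = content;
        }
    }

    private final Map<String, Ref> refs = new HashMap<>();
    private final Function<String, ByteBuffer> fetcher; // e.g. reads the blob store

    RefCountedBlobs(Function<String, ByteBuffer> fetcher) {
        this.fetcher = fetcher;
    }

    // Fetches the blob on first use, then hands out the shared copy.
    synchronized Ref getIncRef(String key) {
        Ref ref = refs.computeIfAbsent(key, k -> new Ref(k, fetcher.apply(k)));
        ref.refCount++;
        return ref;
    }

    // Drops the shared copy once the last user has released it.
    synchronized void decrementRefCount(Ref ref) {
        if (ref.refCount > 0 && --ref.refCount == 0) {
            refs.remove(ref.key);
        }
    }

    synchronized boolean isLoaded(String key) {
        return refs.containsKey(key);
    }
}

// Hypothetical component: takes a reference when created and releases it in
// postClose(), the way a registered close hook would.
class BlobBackedComponent {
    private final RefCountedBlobs repo;
    private final RefCountedBlobs.Ref ref;

    BlobBackedComponent(RefCountedBlobs repo, String key) {
        this.repo = repo;
        this.ref = repo.getIncRef(key);
    }

    ByteBuffer content() {
        return ref.content.asReadOnlyBuffer();
    }

    void postClose() {
        repo.decrementRefCount(ref);
    }
}
```

Two components created with the same key share one fetched buffer, and the 
buffer is dropped from the repository only after both have been closed.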


was (Author: noble.paul):
[~gus_heck] There is another easy, existing solution for this problem.

How to do this:

1) Store your large file in the blob store
2) Use {{blobRef = JarRepository.getJarIncRef(name)}} to get the content (we 
will change the method names so that they make sense for this use)
3) Make your component register as a close hook
4) In {{postClose()}}, call {{blobRef.decrementJarRefCount()}}


The advantages of this solution are:

1) You get a version-managed store for your large files without screwing up 
your ZK
2) It is already reference counted, etc.

The caveats are:

1) It does not work in standalone mode. We can extend it to do that
2) You will have to make your component {{startup=lazy}}

> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 5.3
>Reporter: Gus Heck
> Attachments: SOLR-8349.patch, SOLR-8349.patch



[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores

2016-02-16 Thread Gus Heck (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148797#comment-15148797
 ] 

Gus Heck edited comment on SOLR-8349 at 2/16/16 3:28 PM:
-

Executive summary of the last overly long comment:

# Guava's cache is indeed a cool cache.
# It doesn't support arbitrary loaders in a way that is consistent with my 
design.
# We can do one of these things (AFAICT):
## Use the working code I wrote (no Guava cache)
## Change our goals, and use Guava (allow cores loading the same resource to 
all block each other until loading is done)
## Use Guava and wrap it in additional loader management code of similar 
complexity to my original code.
# Weak/soft values require someone to hold the strong reference.

New thought this morning: I could probably add methods and a list to the 
SolrCore object for the purpose of giving it a reference to the resource at 
load time, thus tying the life-cycle of the resource to the object we want it 
to live and die with. Then weak values would probably work fine.
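
A minimal sketch of that last idea, using plain JDK classes rather than Guava 
or the actual {{SolrCore}}: the cache holds only weak references, and each 
core keeps a strong reference for as long as it is open. {{WeakResourceCache}} 
and {{CoreSketch}} are hypothetical names, and for simplicity this version 
serializes all loads, which is exactly the blocking trade-off listed above:

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical container-level cache that never keeps a resource alive itself.
class WeakResourceCache {
    private final Map<String, WeakReference<Object>> cache = new HashMap<>();

    // Returns the live shared instance, or loads a new one if none exists or
    // the previous one has been garbage collected.
    synchronized Object getOrLoad(String key, Supplier<Object> loader) {
        WeakReference<Object> ref = cache.get(key);
        Object value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.get();
            cache.put(key, new WeakReference<>(value));
        }
        return value;
    }
}

// Hypothetical stand-in for a core: it holds the strong references, so the
// resource lives exactly as long as at least one open core uses it.
class CoreSketch {
    private final List<Object> heldResources = new ArrayList<>();

    Object acquire(WeakResourceCache cache, String key, Supplier<Object> loader) {
        Object resource = cache.getOrLoad(key, loader);
        heldResources.add(resource);
        return resource;
    }

    // After close, the resource becomes collectable once no other core holds it.
    void close() {
        heldResources.clear();
    }
}
```

Entries whose referent has been collected stay in the map until the key is 
loaded again; a fuller version would purge them, for example with a 
{{ReferenceQueue}}.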


was (Author: gus_heck):
Executive summary of the last overly long comment:

# Guava's cache is indeed a cool cache.
# It doesn't support arbitrary loaders in a way that is consistent with my 
design.
# Either we can do one of these things (AFAICT):
## Use the working code I wrote (no Guava cache)
## Change our goals, and use Guava (allow cores loading the same resource to 
all block each other until loading is done)
## Use Guava and wrap it in additional loader management code of similar 
complexity to my original code.
# Weak/soft values require someone to hold the strong reference.

New thought this morning: I could probably add methods and a list to the 
SolrCore object for the purpose of giving it a reference to the resource at 
load time, thus tying the life-cycle of the resource to the object we want it 
to live and die with. Then weak values would probably work fine.

> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 5.3
>Reporter: Gus Heck
> Attachments: SOLR-8349.patch



[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores

2016-02-11 Thread Gus Heck (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142799#comment-15142799
 ] 

Gus Heck edited comment on SOLR-8349 at 2/11/16 2:32 PM:
-

I am not sure I understand what you mean by loading the analysis component 
into the container cache. The analysis component in Lucene needs some way to 
reference the container and ask for the shared resource. My assumption was 
that it was not OK to add dependencies on Solr classes to Lucene. Otherwise I 
could just cast to SolrResourceLoader in the analysis component in question. 
Did you look at SOLR-3443, where I use this patch to implement the feature for 
HunspellStemFilterFactory.java? There I do this:

{code}
if (loader instanceof ContainerResourceSharing) {
  resourceSharing = (ContainerResourceSharing) loader;
}
{code}

This then allows me to have a method like:
{code}
public Dictionary getDictionary() {
  return this.dictionary == null
      ? (Dictionary) resourceSharing.getContainerResource(resourceKey)
      : this.dictionary;
}
{code}
If I can use a Solr class, that could just as easily say:

{code}
if (loader instanceof SolrResourceLoader) {
  resourceSharing = (SolrResourceLoader) loader;
}
{code}

One could also add methods to ResourceLoader itself, but then all forms of 
resource loader have to deal with implementing this method.
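
The pattern can be tried out in isolation with the sketch below. Only 
{{ContainerResourceSharing}} and {{getContainerResource}} come from the 
snippets above; {{LoaderSketch}} (standing in for Lucene's ResourceLoader), 
the two loader classes, {{StemFactorySketch}}, and the fallback to a private 
copy are assumptions for illustration:

```java
import java.util.Map;

// Stand-in for Lucene's ResourceLoader; it knows nothing about sharing.
interface LoaderSketch {
}

// Optional capability interface, named as in the snippets above.
interface ContainerResourceSharing {
    Object getContainerResource(String resourceKey);
}

// A loader that also exposes container-level shared resources.
class SharingLoader implements LoaderSketch, ContainerResourceSharing {
    private final Map<String, Object> shared;

    SharingLoader(Map<String, Object> shared) {
        this.shared = shared;
    }

    @Override
    public Object getContainerResource(String resourceKey) {
        return shared.get(resourceKey);
    }
}

// A loader without the capability, e.g. plain Lucene usage.
class PlainLoader implements LoaderSketch {
}

// Hypothetical factory: uses the shared resource when the loader offers one,
// otherwise falls back to a private copy.
class StemFactorySketch {
    private final String resourceKey;
    private ContainerResourceSharing resourceSharing;
    private Object dictionary;

    StemFactorySketch(String resourceKey) {
        this.resourceKey = resourceKey;
    }

    void inform(LoaderSketch loader) {
        if (loader instanceof ContainerResourceSharing) {
            resourceSharing = (ContainerResourceSharing) loader;
        } else {
            // Placeholder for loading a per-core copy of the dictionary.
            dictionary = "private:" + resourceKey;
        }
    }

    Object getDictionary() {
        return dictionary == null
                ? resourceSharing.getContainerResource(resourceKey)
                : dictionary;
    }
}
```

The {{instanceof}} check keeps the analysis code free of a compile-time 
dependency on any particular loader class, which is the reason for using a 
separate interface here.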


was (Author: gus_heck):
I am not sure I understand what you mean by loading the analysis component 
into the container cache. The analysis component in Lucene needs some way to 
reference the container and ask for the shared resource. My assumption was 
that it was not OK to add dependencies on Solr dependencies to Lucene. 
Otherwise I could just cast to SolrResourceLoader in the analysis component in 
question. Did you look at SOLR-3443, where I use this patch to implement the 
feature for HunspellStemFilterFactory.java? There I do this:

{code}
if (loader instanceof ContainerResourceSharing) {
  resourceSharing = (ContainerResourceSharing) loader;
}
{code}

This then allows me to have a method like:
{code}
public Dictionary getDictionary() {
  return this.dictionary == null
      ? (Dictionary) resourceSharing.getContainerResource(resourceKey)
      : this.dictionary;
}
{code}
If I can use a Solr class, that could just as easily say:

{code}
if (loader instanceof SolrResourceLoader) {
  resourceSharing = (SolrResourceLoader) loader;
}
{code}

One could also add methods to ResourceLoader itself, but then all forms of 
resource loader have to deal with implementing this method.

> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 5.3
>Reporter: Gus Heck
> Attachments: SOLR-8349.patch



[jira] [Comment Edited] (SOLR-8349) Allow sharing of large in memory data structures across cores

2015-12-02 Thread Gus Heck (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15031082#comment-15031082
 ] 

Gus Heck edited comment on SOLR-8349 at 12/2/15 7:53 PM:
-

Patch implementing general feature (on 6.x). Patch for SOLR-3443 to be attached 
there shortly.


was (Author: gus_heck):
Patch implementing general feature. Patch for SOLR-3443 to be attached there 
shortly.

> Allow sharing of large in memory data structures across cores
> -
>
> Key: SOLR-8349
> URL: https://issues.apache.org/jira/browse/SOLR-8349
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 5.3
>Reporter: Gus Heck
> Attachments: SOLR-8349.patch