Re: [Dspace-tech] Question about update-discovery-index -r
On Fri, Jan 10, 2014 at 10:30 PM, Terry Brady tw...@georgetown.edu wrote:

> I presumed that I could remove the documents for an entire
> community/collection in a single command.

Sorry about the confusion, that didn't occur to me. A Solr index is just a collection of documents with no intrinsic concept of hierarchy. Is there any way we could write the help text more clearly?

> Perhaps it would be useful to have a new command line option to force
> the re-indexing of a specific community or collection. It might also be
> useful to allow multiple items to be deleted in a single command. If this
> sounds potentially useful to others, I will file an enhancement request.

Sure, feel free to file one. Although I personally can't imagine a case where it would be useful. It sounds ad hoc - and if you need something ad hoc, it sounds more flexible to talk directly to Solr than to build it into DSpace. In your case, your query would look like:

  $ curl "http://localhost:8080/solr/search/select/?q=location.coll:1234&rows=0"

Measure twice, cut once.

  $ curl "http://localhost:8080/solr/search/update?stream.body=<update><delete><query>location.coll:1234</query></delete><commit/></update>"

Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
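The two requests in this message (count the matches first, then delete them and commit) can be sketched in Python as below. This is a minimal sketch, not part of DSpace: the helper names `count_url` and `delete_url` are made up for illustration, the base URL and the `location.coll:1234` query are the ones quoted in the thread, and the script only builds and prints the URLs, it does not contact Solr. Percent-encoding the XML body avoids shell quoting problems with `<`, `>` and `&`.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape

# Base URL of the Discovery core, as used in the thread.
SOLR_CORE_URL = "http://localhost:8080/solr/search"


def count_url(query, base=SOLR_CORE_URL):
    """Build a select URL that asks only for the match count (rows=0)."""
    return f"{base}/select/?{urlencode({'q': query, 'rows': 0})}"


def delete_url(query, base=SOLR_CORE_URL):
    """Build an update URL that deletes every match of `query` and commits."""
    body = (
        "<update><delete><query>"
        + escape(query)  # keep the XML well-formed if the query contains & or <
        + "</query></delete><commit/></update>"
    )
    return f"{base}/update?{urlencode({'stream.body': body})}"


if __name__ == "__main__":
    query = "location.coll:1234"  # collection query from the thread
    # Measure twice: inspect the count request before issuing the delete.
    print(count_url(query))
    print(delete_url(query))
```

Either URL can then be passed to curl in single quotes. Checking the count request first shows how many documents the delete would remove.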
Re: [Dspace-tech] Question about update-discovery-index -r
Ivan,

Your recommended solution looks simple and quick to me. I had not considered issuing my own transactions against the SOLR service. Given its simplicity, your recommendation seems preferable to adding complexity to the command-line interface.

Looking at the command-line help, the left side of the help is clear that an item handle is recommended. Based on my experience with the filter-media handle parameter, I presumed that passing a collection/community handle to this command might trigger the re-processing of the entire community/collection.

Here are a couple of possible alternatives for the help text:

- remove an Item from the index based on its handle
- remove a solr document from the index based on its handle

Terry

On Mon, Jan 13, 2014 at 3:33 AM, helix84 heli...@centrum.sk wrote:

> It sounds ad hoc - and if you need something ad hoc, it sounds more
> flexible to talk directly to Solr than to build it into DSpace.

--
Terry Brady
Applications Programmer Analyst
Lauinger Information Technology
202-687-7053
Re: [Dspace-tech] Question about update-discovery-index -r
On Thu, Jan 9, 2014 at 7:23 PM, Terry Brady tw...@georgetown.edu wrote:

> If I run this command, the log file indicates that my item has been
> removed, but I am unable to detect an impact caused by running this
> option. Looking at IndexClient.java and SolrServiceImpl.java I notice
> that the change is not committed in SOLR after it is made.

Hi Terry,

it's entirely possible that it is so. Commits affect performance heavily, so they are delayed whenever possible to run as many changes in batch as reasonable using autocommit. You can see the default values for DSpace here:

https://github.com/DSpace/DSpace/blob/dspace-3.1/dspace/solr/search/conf/solrconfig.xml#L299

If you want to force a commit manually, it's as easy as running:

  curl "http://localhost:8080/solr/search/update?stream.body=<commit/>"

> On both of these occasions, I had to force the re-index of my entire
> repository (update-discovery-index -f).

That shouldn't be necessary. Once the Solr document for a particular handle is missing, update-discovery-index without the -f parameter should add it, just as if it were a newly created item.

Let me know whether you got it working.

Regards,
~~helix84
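To illustrate why a removal can be logged and still have no visible effect, here is a toy model of delayed commits in Python. It is a sketch only and not how Solr is implemented: the class `ToyAutoCommitIndex`, its thresholds, and the handles are invented for the example. The two thresholds loosely mirror the `maxDocs` and `maxTime` settings of Solr's `autoCommit` configuration.

```python
class ToyAutoCommitIndex:
    """Toy index: changes become visible to searches only after a commit."""

    def __init__(self, max_docs, max_time_s):
        self.max_docs = max_docs        # commit once this many changes are pending
        self.max_time_s = max_time_s    # ...or once the oldest change is this old
        self.visible = set()            # what a search would return
        self.pending = []               # uncommitted (operation, handle) pairs
        self.first_pending_at = None

    def add(self, handle, now):
        self._record("add", handle, now)

    def delete(self, handle, now):
        self._record("delete", handle, now)

    def tick(self, now):
        """Let time pass; may trigger a time-based autocommit."""
        self._maybe_autocommit(now)

    def commit(self):
        """Apply all pending changes, like an explicit <commit/>."""
        for op, handle in self.pending:
            if op == "add":
                self.visible.add(handle)
            else:
                self.visible.discard(handle)
        self.pending.clear()
        self.first_pending_at = None

    def _record(self, op, handle, now):
        if not self.pending:
            self.first_pending_at = now
        self.pending.append((op, handle))
        self._maybe_autocommit(now)

    def _maybe_autocommit(self, now):
        if not self.pending:
            return
        too_many = len(self.pending) >= self.max_docs
        too_old = now - self.first_pending_at >= self.max_time_s
        if too_many or too_old:
            self.commit()


if __name__ == "__main__":
    index = ToyAutoCommitIndex(max_docs=3, max_time_s=60)
    for handle in ("123/1", "123/2", "123/3"):   # hypothetical handles
        index.add(handle, now=0)                 # third add triggers a commit
    index.delete("123/1", now=1)                 # accepted, but not committed yet
    print("123/1" in index.visible)              # still searchable
    index.commit()                               # the manual commit from the message
    print("123/1" in index.visible)              # now gone
```

In this model the delete is accepted immediately but stays pending until a threshold is reached or a commit is forced. That matches the behaviour described above: the log reports the removal, while searches and facets are unchanged until the commit.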
Re: [Dspace-tech] Question about update-discovery-index -r
Ivan,

Thank you for the additional details. Today, with a bit more time and patience, I noticed that if I remove individual items from the SOLR index, my facets do update appropriately.

The command line help was a bit misleading to me:

  -r <item handle>   remove an Item, Collection or Community from index based on its handle

I presumed that I could remove the documents for an entire community/collection in a single command.

Perhaps it would be useful to have a new command line option to force the re-indexing of a specific community or collection. It might also be useful to allow multiple items to be deleted in a single command. If this sounds potentially useful to others, I will file an enhancement request.

Terry

On Fri, Jan 10, 2014 at 4:52 AM, helix84 heli...@centrum.sk wrote:

> Commits affect performance heavily, so they are delayed whenever
> possible to run as many changes in batch as reasonable using autocommit.
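Until an option for removing several items exists, one workaround is to call the existing `-r` option once per handle. The Python sketch below only builds and prints the command lines. The launcher path `/dspace/bin/dspace`, the helper name `removal_commands`, and the handles are assumptions for illustration; adjust them to the local installation.

```python
import shlex


def removal_commands(handles, launcher="/dspace/bin/dspace"):
    """Return one 'update-discovery-index -r <handle>' command per handle.

    `launcher` is an assumed install path, not taken from the thread.
    """
    return [[launcher, "update-discovery-index", "-r", handle] for handle in handles]


if __name__ == "__main__":
    # Hypothetical item handles; replace with real ones.
    for command in removal_commands(["123456789/10", "123456789/11"]):
        print(shlex.join(command))  # shell-safe rendering of the argument list
```

Each command starts its own JVM, so this is only practical for a handful of items. As noted earlier in the thread, the removals may not become visible until Solr commits.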
[Dspace-tech] Question about update-discovery-index -r
I have a question about the following option in update-discovery-index:

  -r <item handle>   remove an Item, Collection or Community from index based on its handle

On two occasions, I have wanted to force the re-indexing of a specific collection/community:

(1) After customizing the facets for a specific collection
(2) After moving a collection from one community to another

If I run this command, the log file indicates that my item has been removed, but I am unable to detect an impact caused by running this option. Looking at IndexClient.java and SolrServiceImpl.java I notice that the change is not committed in SOLR after it is made.

On both of these occasions, I had to force the re-index of my entire repository (update-discovery-index -f). Is it possible to accomplish what I am trying to achieve without rebuilding the entire index?

I am running DSpace 3.1.

Thanks,
Terry