Re: [Dspace-tech] How to stop using full text search?
On Thu, Feb 6, 2014 at 11:19 PM, Calloni, Rodrigo rcall...@iadb.org wrote:

> I am getting Internal Server Error when I run the index after cleaning up the index and changing the schema.xml
> When I add -b the error message doesn't show up.

So, I understand that it's working when you reindex with -b, right?

Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
[Dspace-tech] How to stop using full text search?
Hello,

We are on DSpace 3.2 XMLUI. I am trying to stop our DSpace from using the full text extracted by filter-media in search. I already commented out filter.plugins in dspace.cfg, but I don't think that is enough.

Is there a configuration setting that defines which fields are indexed for search?

Thanks in advance,
Rodrigo
Re: [Dspace-tech] How to stop using full text search?
On Thu, Feb 6, 2014 at 3:56 PM, Calloni, Rodrigo rcall...@iadb.org wrote:

> I already commented out the filter.plugins from dspace.cfg but I don't think that is enough.

The Solr index still contains the previously extracted text. You have to rebuild the Solr index:

    [dspace]/bin/dspace update-discovery-index -b

> Is there a configuration that we can do to define which fields will be indexed for the search?

Yes, you can comment out the fulltext fields in [dspace]/solr/search/conf/schema.xml before you reindex.

Alternatively, since you already disabled the filters, you can remove the extracted text files from the TEXT bundles and rebuild the index, so that there is nothing to add to the index. This approach also ensures that no extracted bitstreams are left lying around in TEXT bundles, potentially exposed anonymously (I don't remember off-hand whether these are accessible via HTTP).

Regards,
~~helix84
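Before commenting out the fulltext fields as suggested above, it may help to list every place where schema.xml mentions them. The sketch below is not from this thread: list_field_refs is a hypothetical helper, and the idea that some other directive (a copyField, for example) might still point at a commented-out field is only an assumption to check against your own schema.

```shell
#!/bin/sh
# Hypothetical helper (not part of DSpace): print every line of a Solr
# schema file that mentions the given field name in a name=, source= or
# dest= attribute, prefixed with its line number.
list_field_refs() {
    schema_file=$1
    field_name=$2
    if [ ! -f "$schema_file" ]; then
        echo "schema file not found: $schema_file" >&2
        return 1
    fi
    grep -n -E "(name|source|dest)=\"$field_name\"" "$schema_file"
}

# Example call; replace [dspace] with your own installation directory.
# list_field_refs "[dspace]/solr/search/conf/schema.xml" fulltext
```

If the helper prints more than the field definition itself, those extra lines are worth reviewing before the reindex.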
Re: [Dspace-tech] How to stop using full text search?
OK, I don't know why, but perhaps the -b flag isn't removing the old contents. Try removing the index manually:

    cp -r [dspace]/solr/search/data/index [dspace]/solr/search/data/index.bak
    rm [dspace]/solr/search/data/index/*

Then recreate it with the new schema:

    [dspace]/bin/dspace update-discovery-index

You should be able to verify that these fields are gone from the index by going to the Solr admin UI [1]. Actually, check this now, before you execute the steps above, to confirm the problem.

[1] https://wiki.duraspace.org/display/DSPACE/Solr

Regards,
~~helix84
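The manual cleanup above could be wrapped in a small function that refuses to delete anything unless the backup copy succeeded. This is only a sketch: backup_and_clear_index is a hypothetical name, and the timestamped backup suffix is an assumption, not something the thread prescribes.

```shell
#!/bin/sh
# Hypothetical helper: copy a Solr index directory to a timestamped backup,
# then delete the files inside the original directory. Prints the backup
# path on success.
backup_and_clear_index() {
    index_dir=$1
    if [ ! -d "$index_dir" ]; then
        echo "index directory not found: $index_dir" >&2
        return 1
    fi
    backup_dir="$index_dir.bak.$(date +%Y%m%d%H%M%S)"
    # Stop here if the copy fails, so nothing is deleted without a backup.
    cp -r "$index_dir" "$backup_dir" || return 1
    # ${index_dir:?} aborts if the variable is empty, so this can never
    # expand to "rm -rf /*".
    rm -rf "${index_dir:?}"/*
    echo "$backup_dir"
}

# Example call; replace [dspace] with your own installation directory.
# backup_and_clear_index "[dspace]/solr/search/data/index"
```

After it runs, the reindex command from the message above recreates the index. The thread does not say whether the servlet container should be stopped first; stopping it before touching the index files is probably the safer choice.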
Re: [Dspace-tech] How to stop using full text search?
Thanks Ivan.

When I commented out the fulltext lines in schema.xml, the reindex finished very quickly and search stopped working.

    <!--<field name="text" type="text" indexed="true" stored="false" multiValued="true"/> -->
    <!--<field name="fulltext" type="text" indexed="true" stored="true" multiValued="true"/> -->

I restored the lines and the reindex finished OK, but full text is still being used.

Any ideas?

Rodrigo
Re: [Dspace-tech] How to stop using full text search?
Thanks again,

I am getting an Internal Server Error when I run the index after cleaning up the index and changing schema.xml:

    [dspace@ip-172-31-28-251 bin]$ ./dspace update-discovery-index
    INFO [main] (DSpaceKernelInit.java:52) - Created new kernel: DSpaceKernel:org.dspace:name=23dd0b1b-b577-456b-a835-3e81125b28c9,type=DSpaceKernel:lastLoad=null:loadTime=0:running=false:kernel=null
    INFO [main] (ConfigurationManager.java:1217) - Loading from classloader: file:/home/dspace/dspace/config/dspace.cfg
    INFO [main] (ConfigurationManager.java:1217) - Using dspace provided log configuration (log.init.config)
    INFO [main] (ConfigurationManager.java:1217) - Loading: /home/dspace/dspace/config/log4j.properties
    Exception: Error executing query
    org.dspace.discovery.SearchServiceException: Error executing query
        at org.dspace.discovery.SolrServiceImpl.cleanIndex(SolrServiceImpl.java:418)
        at org.dspace.discovery.IndexClient.main(IndexClient.java:119)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)
    Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
        at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
        at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:266)
        at org.dspace.discovery.SolrServiceImpl.getSolr(SolrServiceImpl.java:106)
        at org.dspace.discovery.SolrServiceImpl.cleanIndex(SolrServiceImpl.java:388)
        ... 6 more
    Caused by: org.apache.solr.common.SolrException: Internal Server Error

    Internal Server Error

    request: http://localhost:8080/solr/search/select?q=search.resourcetype:2 AND search.resourceid:1&wt=javabin&version=2
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:432)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:246)
        at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
        ... 9 more

When I add -b the error message doesn't show up.

Rodrigo
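The stack trace above only reports "Internal Server Error" for the select request. Sending the same query to Solr directly may show more detail in the response body or in the servlet container's log. The sketch below is an assumption-laden illustration: solr_select_url is a hypothetical helper, and it percent-encodes spaces only, which happens to be enough for the query in the trace.

```shell
#!/bin/sh
# Hypothetical helper: build a Solr select URL for a query string.
# Only spaces are percent-encoded; other reserved characters would need
# real URL encoding.
solr_select_url() {
    core_url=$1
    query=$2
    encoded=$(printf '%s' "$query" | sed 's/ /%20/g')
    printf '%s/select?q=%s\n' "$core_url" "$encoded"
}

# Example call against the core named in the stack trace. The -i flag makes
# curl print the HTTP status line and headers along with the body.
# curl -i "$(solr_select_url http://localhost:8080/solr/search 'search.resourcetype:2 AND search.resourceid:1')"
```

The wt=javabin and version parameters from the original request are left out on purpose, so that Solr answers in its default, human-readable response format.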