Re: [Dspace-tech] Difference between Lucene and Solr Indexing Commands
On Thu, Nov 28, 2013 at 3:01 AM, David Cook dc...@prosentient.com.au wrote:

> 1) Is it necessary to run index-update and/or update-discovery-index as cronjobs? The indexes and browse tables appear to be updated automatically (notwithstanding that cocoon caching issue), so it doesn't seem necessary.

Ideally, it shouldn't be necessary. However, there may be situations that DSpace doesn't handle correctly; your 3) is likely one of them. It doesn't hurt to run them once a day from a cronjob: if there are no new changes, they should finish quickly. The other use case is modifying content directly in the database; in that case you have to run them manually to update the index.

> I imagine index-update should only need to be run after changes to the indexing structure have occurred?

Depends on what you mean by indexing structure. When adding/removing indexed fields, you have to run index-init / update-discovery-index -b. index-update and update-discovery-index only add/remove records. Think of it as columns vs. rows.

> Also, update-discovery-index might only need to be run regularly (i.e. daily) with the optimize option?

It is recommended to run the optimize option once in a while, especially on the Solr statistics core if you use it.

> PS: Withdrawn items still appear in Solr usage statistics in the DSpace UI but they show a number (perhaps their internal UID?) instead of their title. Is this intentional or should withdrawn items not appear in the usage statistics?

I think it's debatable. Please file a Jira issue to make sure the developers will discuss it in the future.

> 3) As for the cocoon cache, I've followed your suggestion of using index-update, but it hasn't changed anything. Any time I change the title of an item, the old title is presented when browsing (in the XMLUI, not the JSPUI) but is updated everywhere else. This behaviour is identical even when using the SolrBrowseDAO.
> So far, the only way I've gotten the correct title to show in the browse is to clear the cache or to add a new item (which must trigger a new page to be created and cached).

OK, as I said, this might be an oversight. Please file a Jira issue.

Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

--
Rapidly troubleshoot problems before they affect your business. Most IT organizations don't have a clear picture of how application performance affects their revenue. With AppDynamics, you get 100% visibility into your Java, .NET, PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
http://pubads.g.doubleclick.net/gampad/clk?id=84349351iu=/4140/ostg.clktrk
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
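
Editor's note on the "columns vs. rows" distinction in this message: the rule can be sketched as a small lookup. This is only an illustration of the reply as written; the names ChangeKind and commands_for are hypothetical, and the command strings are exactly the ones quoted in the thread (no other flags are assumed).

```python
from enum import Enum


class ChangeKind(Enum):
    """Hypothetical labels for the two cases described in the thread."""
    RECORDS = "rows"     # items were added, removed or edited
    FIELDS = "columns"   # indexed fields were added or removed


def commands_for(kind):
    """Return the launcher commands named in the thread, per search backend.

    Keys are illustrative: "lucene" for the Lucene index, "discovery"
    for the Solr-based Discovery index.
    """
    if kind is ChangeKind.FIELDS:
        # The index structure changed, so a full rebuild is needed.
        return {"lucene": "index-init",
                "discovery": "update-discovery-index -b"}
    # Only records changed: an incremental update is enough.
    return {"lucene": "index-update",
            "discovery": "update-discovery-index"}


if __name__ == "__main__":
    for kind in ChangeKind:
        print(kind.value, commands_for(kind))
```

A daily cronjob, as suggested above, would correspond to the RECORDS case; the FIELDS case is a one-off step after a configuration change.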
[Dspace-tech] Upgrade Solr 3.5 to 4.1
Hi,

I currently have Solr 3.5 installed with DSpace 3.2 + the CRIS module, and I want to upgrade to Solr 4.1 because it is a prerequisite of DSpace + the CRIS module.

I downloaded Solr from:
http://archive.apache.org/dist/lucene/solr/4.1.0/solr-4.1.0.zip

I don't know which Solr files I have to modify or copy into DSpace, so I need your help to upgrade Solr successfully.

Thanks in advance

--
Rubén Boada Navarrete
Portals and Repositories Technician
Centre de Serveis Científics i Acadèmics de Catalunya (CESCA)
Gran Capità, 2-4 (Edifici Nexus) . 08034 Barcelona
T. 93 205 6464 (ext. 302) . F. 93 551 62 13 . rbo...@cesca.cat
Subscribe to the newsletter (www.cesca.cat/butlleti)
Re: [Dspace-tech] Can't add project in DSpace with module CRIS
Hi Pascarelli,

Sorry for so many questions, but I am having problems upgrading my Solr. I have DSpace 3.X + the CRIS module (downloaded from https://github.com/Cineca/dspace-cris/), but by mistake a member of my team (who is not here now) installed Solr 3.5 instead of Solr 4.1.0 :( So I want to migrate to Solr 4.1.0. I did these steps:

1) Download Solr 4.1.0: http://archive.apache.org/dist/lucene/solr/4.1.0/solr-4.1.0.zip
2) Unzip it.
3) Copy the war /solr4/solr-4.1.0/dist/solr-4.1.0.war to my Tomcat webapps.
4) Deploy the war in context '/solr'.
5) Check that the Solr server configuration in 'dspace.cfg' is OK:

   default.solr.server = ${dspace.baseUrl}/solr

6) Edit '/dspace/webapps/solr/WEB-INF/web.xml' and uncomment/edit:

   <env-entry>
     <env-entry-name>solr/home</env-entry-name>
     <env-entry-value>/dades/dspace/solr</env-entry-value>
     <env-entry-type>java.lang.String</env-entry-type>
   </env-entry>

   The directory where I installed DSpace is '/dades/dspace'.
7) Restart Tomcat. But when I go to http://portalrecerca-dev.cesca.cat/solr/ I get this error:

   collection1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load config for solrconfig.xml

8) Run './dspace update-discovery-index' from /dspace/bin, but I get an error too (output below).

What am I doing wrong? :(

Thank you very much in advance!
INFO [main] (DSpaceKernelInit.java:52) - Created new kernel: DSpaceKernel:org.dspace:name=5017caa7-a37a-4d1e-b1f4-3e2110836394,type=DSpaceKernel:lastLoad=null:loadTime=0:running=false:kernel=null
INFO [main] (ConfigurationManager.java:1233) - Loading from classloader: file:/dades/dspace/config/dspace.cfg
INFO [main] (ConfigurationManager.java:1233) - Using dspace provided log configuration (log.init.config)
INFO [main] (ConfigurationManager.java:1233) - Loading: /dades/dspace/config/log4j.properties
Exception: null
java.lang.NullPointerException
    at org.dspace.app.cris.discovery.CrisSearchService.indexProperty(CrisSearchService.java:457)
    at org.dspace.app.cris.discovery.CrisSearchService.indexCrisObject(CrisSearchService.java:313)
    at org.dspace.app.cris.discovery.CrisSearchService.createCrisIndex(CrisSearchService.java:670)
    at org.dspace.app.cris.discovery.CrisSearchService.createCrisIndex(CrisSearchService.java:357)
    at org.dspace.app.cris.discovery.CrisSearchService.updateCrisIndex(CrisSearchService.java:366)
    at org.dspace.app.cris.discovery.CrisSearchService.updateIndex(CrisSearchService.java:196)
    at org.dspace.discovery.IndexClient.main(IndexClient.java:146)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)

Daniel

On 25/11/13 18:39, Pascarelli Luigi Andrea wrote:

> Dear Daniel,
> to use DSpace-CRIS it is mandatory to migrate to Solr 4.1.0.
> To rebuild the Solr index data in the right format for 4.1.0, you can call the optimize command directly on the Solr server:
>
>   http://[hostname:port]/solr/[solrcore]/update?optimize=true
>
> DSpace-CRIS has 5 Solr cores (3 derived from DSpace, 1 for the network feature, 1 for the PubMed feature), so for example you can call the following links via HTTP (open your browser or use the wget or curl tool):
>
>   http://localhost:8080/solr/search/update?optimize=true
>   http://localhost:8080/solr/statistics/update?optimize=true
>   http://localhost:8080/solr/oai/update?optimize=true
>   http://localhost:8080/solr/network/update?optimize=true
>   http://localhost:8080/solr/pmc/update?optimize=true
>
> If your installation of DSpace-CRIS is derived from an old DSpace installation (1.8), then before migrating the index data to 4.1.0 you need to optimize the index with the optimize implemented on Solr server 3.5.0 (after that you can turn it off, migrate to 4.1.0 and optimize the cores again). For other details see https://jira.duraspace.org/browse/DS-1776
>
> BTW, for the Solr search core you can manually remove the index data directory [dspace]/solr/search/data and run the "dspace update-discovery-index" command from [dspace]/bin (recall that the Solr server must be up and running before launching the indexer). After this you can also rebuild the index data for the network feature by running "dspace dsrun org.dspace.app.cris.batch.ScriptIndexNetwork -a" or if you have interest
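
Editor's note: the per-core optimize calls listed in the quoted message follow one URL pattern, so they can be generated rather than typed. The sketch below only builds the URLs; the function name optimize_urls and the default host and port are assumptions for illustration, while the core names and the "update?optimize=true" path are taken from the message.

```python
from urllib.parse import urlunsplit

# Core names as listed in the quoted message for a DSpace-CRIS install.
CRIS_CORES = ["search", "statistics", "oai", "network", "pmc"]


def optimize_urls(host="localhost", port=8080, cores=CRIS_CORES):
    """Build one optimize URL per Solr core.

    host and port are placeholders (assumption); adjust them to the
    Tomcat instance that actually serves the /solr context.
    """
    return [
        urlunsplit(("http", f"{host}:{port}",
                    f"/solr/{core}/update", "optimize=true", ""))
        for core in cores
    ]


if __name__ == "__main__":
    # Print the URLs so they can be opened in a browser or passed to wget/curl.
    for url in optimize_urls():
        print(url)
```

Printing the URLs instead of requesting them keeps the sketch free of side effects; each one still has to be fetched while the Solr server is up.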
Re: [Dspace-tech] Driver and Openaire sets under request URL
Hi Ruiz,

I've already had some discussion around that issue. Yes, it's possible to have those sets available in the default request context. However, the Driver and OpenAIRE projects have specific guidelines which force OAI data providers to expose data values differently from the default DSpace responses. For example, the Driver guidelines demand a different date format, and there are some mandatory prefixes for dc.type and dc.rights. The OpenAIRE responses also differ subtly from both the default and the Driver responses.

So, again, you could have these sets defined in the default context, but is that what you really want? If so, within the definition of the set (xoai.xml) it's possible to declare the list of associated filters, which would be the same ones defined in the driver and openaire contexts respectively.

If you need further help, just ask; it is just a configuration change.

On 28 November 2013 09:12, RUIZ MORENO, ROBERT robert.r...@upf.edu wrote:

> Hi everyone,
>
> I'm using DSpace 3.1 XMLUI. I'm experimenting with the new xoai OAI 2.0 implementation. As I've seen, the implementation is now DRIVER and OPENAIRE compliant, and thus there are three available URLs:
>
>   http://oai-url/request
>   http://oai-url/driver
>   http://oai-url/openaire
>
> Driver generates the driver set and openaire generates the ec_fundedresources set, but those sets are only available under the driver and openaire URLs, respectively, and not under the request URL. My wish is that the driver and openaire sets were available under the request URL. How can I accomplish that?
>
> Thanks in advance.
>
> Robert Ruiz

--
Thanks,
João Melo (My Portfolio http://www.lyncode.com/m/jmelo/)
DSpace Department
Lyncode: Official website http://www.lyncode.com/ . http://twitter.com/lyncode . http://www.facebook.com/lyncode
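
Editor's note: one way to see which sets each context currently exposes is to send the standard OAI-PMH ListSets request to each of the three URLs and compare the setSpec values. The sketch below is an assumption-laden illustration: list_sets_url and set_specs are hypothetical helper names, "http://oai-url" is the placeholder base URL from the question, and SAMPLE is made-up response data used only to show the parsing (the set names in it are invented).

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# The three contexts mentioned in the question.
CONTEXTS = ["request", "driver", "openaire"]


def list_sets_url(base_url, context):
    """Build the ListSets request URL for one context."""
    return f"{base_url.rstrip('/')}/{context}?verb=ListSets"


def set_specs(list_sets_xml):
    """Extract every setSpec value from a ListSets response document."""
    root = ET.fromstring(list_sets_xml)
    return [el.text for el in root.iter(f"{OAI_NS}setSpec")]


# Made-up sample response, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>driver</setSpec><setName>Sample driver set</setName></set>
    <set><setSpec>ec_fundedresources</setSpec><setName>Sample funded set</setName></set>
  </ListSets>
</OAI-PMH>"""

if __name__ == "__main__":
    for ctx in CONTEXTS:
        print(list_sets_url("http://oai-url", ctx))
    print(set_specs(SAMPLE))
```

Fetching each URL and passing the body to set_specs before and after editing xoai.xml would show whether the driver and ec_fundedresources sets have become visible under the request context.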
[Dspace-tech] R: Difference between Lucene and Solr Indexing Commands
Hi David,

since you say that you are building a repository primarily for technical testing and want to learn more about the way DSpace works, I suggest you take a look at the upcoming DSpace 4. You can build it from master, use the rc1 tag, or wait for rc2 and the final release in December; see https://wiki.duraspace.org/display/DSPACE/DSpace+Release+4.0+Notes

Please note two major changes in DSpace 4:

1) Solr/Discovery will be the default search provider, and Lucene is likely to be deprecated.
2) The JSPUI look and feel has been fully redesigned. Take a look at it on the demo server: http://demo.dspace.org/jspui

Andrea

Sent from Samsung Mobile

-------- Original message --------
From: helix84 heli...@centrum.sk
Date: 28/11/2013 09:51 (GMT+01:00)
To: David Cook dc...@prosentient.com.au
Cc: dspace-tech dspace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] Difference between Lucene and Solr Indexing Commands

On Thu, Nov 28, 2013 at 3:01 AM, David Cook dc...@prosentient.com.au wrote:

> 1) Is it necessary to run index-update and/or update-discovery-index as cronjobs? The indexes and browse tables appear to be updated automatically (notwithstanding that cocoon caching issue), so it doesn't seem necessary.

Ideally, it shouldn't be necessary. However, there may be situations that DSpace doesn't handle correctly; your 3) is likely one of them. It doesn't hurt to run them once a day from a cronjob: if there are no new changes, they should finish quickly. The other use case is modifying content directly in the database; in that case you have to run them manually to update the index.

> I imagine index-update should only need to be run after changes to the indexing structure have occurred?

Depends on what you mean by indexing structure. When adding/removing indexed fields, you have to run index-init / update-discovery-index -b. index-update and update-discovery-index only add/remove records. Think of it as columns vs. rows.

> Also, update-discovery-index might only need to be run regularly (i.e. daily) with the optimize option?

It is recommended to run the optimize option once in a while, especially on the Solr statistics core if you use it.

> PS: Withdrawn items still appear in Solr usage statistics in the DSpace UI but they show a number (perhaps their internal UID?) instead of their title. Is this intentional or should withdrawn items not appear in the usage statistics?

I think it's debatable. Please file a Jira issue to make sure the developers will discuss it in the future.

> 3) As for the cocoon cache, I've followed your suggestion of using index-update, but it hasn't changed anything. Any time I change the title of an item, the old title is presented when browsing (in the XMLUI, not the JSPUI) but is updated everywhere else. This behaviour is identical even when using the SolrBrowseDAO. So far, the only way I've gotten the correct title to show in the browse is to clear the cache or to add a new item (which must trigger a new page to be created and cached).

OK, as I said, this might be an oversight. Please file a Jira issue.

Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
Re: [Dspace-tech] Upgrade Solr 3.5 to 4.1
Hi Ruben,

you have to deploy a fresh Solr webapp into your Tomcat. So, unzip the package and find the solr.war (it is usually under the dist folder). Deploy the war in context solr/ (via the Tomcat manager, or by copying the war into the Tomcat webapps directory) and set the environment variables in the Tomcat context.xml file, like:

   <Parameter name="log4j.configuration" value="[dspace-config]/log4j-solr.properties" override="false"/>
   <Parameter name="LocalhostRestrictionFilter.localhost" value="false" override="false"/>
   <Environment name="solr/home" value="[dspace.dir]/solr" type="java.lang.String" override="false"/>

Start Tomcat and check that the installation is correct in your browser at http://localhost:8080/solr/search

I assume that you have done a default CRIS installation, that the URL of the Solr server is http://localhost:8080/solr, and that the servlet container is the same one that runs your DSpace webapp. As for how to deploy a war, Google knows more than me :-)

Hope this helps.

Luigi Andrea

On 28/11/2013 12:51, Ruben wrote:

> Hi,
>
> I currently have Solr 3.5 installed with DSpace 3.2 + the CRIS module, and I want to upgrade to Solr 4.1 because it is a prerequisite of DSpace + the CRIS module.
>
> I downloaded Solr from:
> http://archive.apache.org/dist/lucene/solr/4.1.0/solr-4.1.0.zip
>
> I don't know which Solr files I have to modify or copy into DSpace, so I need your help to upgrade Solr successfully.
>
> Thanks in advance
>
> --
> Rubén Boada Navarrete
> Portals and Repositories Technician
> Centre de Serveis Científics i Acadèmics de Catalunya (CESCA)
> Gran Capità, 2-4 (Edifici Nexus) . 08034 Barcelona
> T. 93 205 6464 (ext. 302) . F. 93 551 62 13 . rbo...@cesca.cat
> Subscribe to the newsletter (www.cesca.cat/butlleti)

--
Luigi Andrea Pascarelli
Dipartimento Servizi e Soluzioni per l'Amministrazione Universitaria
Divisione Ricerca
Via dei Tizii, 6
00185 Roma, Italy
ph. +39 06 59292895
http://www.cineca.it
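
Editor's note: a "Could not load config for solrconfig.xml" error often comes down to solr/home pointing at the wrong directory, so it can help to read the value back out of the context file before restarting Tomcat. The sketch below is a minimal check under stated assumptions: solr_home is a hypothetical helper name, and SAMPLE is an invented context fragment that reuses the /dades/dspace path mentioned elsewhere in this archive.

```python
import xml.etree.ElementTree as ET


def solr_home(context_xml):
    """Return the value of the solr/home Environment entry, or None if absent."""
    root = ET.fromstring(context_xml)
    for env in root.iter("Environment"):
        if env.get("name") == "solr/home":
            return env.get("value")
    return None


# Invented sample context file, for illustration only.
SAMPLE = """<Context>
  <Parameter name="LocalhostRestrictionFilter.localhost" value="false" override="false"/>
  <Environment name="solr/home" value="/dades/dspace/solr" type="java.lang.String" override="false"/>
</Context>"""

if __name__ == "__main__":
    print(solr_home(SAMPLE))
```

The returned directory should be the one that contains the per-core folders (search, statistics and so on); if it is missing or points elsewhere, that is worth fixing before looking at the indexer.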
[Dspace-tech] Filtering by values in a new table
Hi all,

I'm working on a fairly modified implementation of DSpace (version 3.2 on PostgreSQL), and one of the features we are implementing is an update to the browse list so that users can choose whether or not to show items that have been deemed 'past their sell-by date'.

The general idea is that I have created an additional table that stores the id of the item and an integer value. When this value drops below a certain point, we don't want that item to show up in browse lists (unless the user selects an option to show it).

In SQL this is fairly simple to do: we either use WHERE item_id IN (SELECT item_id FROM new_table WHERE value > 0), or simply join the tables and then evaluate the condition.

However, the actual list of items obtained through the browse page goes through a number of classes and interfaces that make this very difficult (if not impossible) to do. Am I missing something, or do I need to do a huge amount of redevelopment to be able to do this? I've basically spent the whole day so far going around in circles on something that we thought would be fairly simple to do...

Thanks,

Adam Rousell
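
Editor's note: the filter described in this message can be tried outside DSpace to confirm the SQL itself behaves as intended. The sketch below is a toy model only: it uses SQLite in memory as a stand-in for PostgreSQL, the table and column names (item, item_freshness, title, value) are hypothetical, and the "value > 0" threshold is an assumption about where the cut-off sits. It says nothing about where such a clause would have to be hooked into the DSpace browse classes.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with two toy tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE item (item_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE item_freshness (item_id INTEGER, value INTEGER);
        INSERT INTO item VALUES (1, 'Current report');
        INSERT INTO item VALUES (2, 'Stale report');
        INSERT INTO item_freshness VALUES (1, 5);
        INSERT INTO item_freshness VALUES (2, -1);
    """)
    return conn


def browse_items(conn, show_expired=False):
    """List items, hiding those whose value is not above 0 unless requested."""
    sql = "SELECT item_id, title FROM item"
    if not show_expired:
        # Subquery form of the filter described in the message.
        sql += (" WHERE item_id IN"
                " (SELECT item_id FROM item_freshness WHERE value > 0)")
    sql += " ORDER BY item_id"
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    print(browse_items(conn))
    print(browse_items(conn, show_expired=True))
```

The show_expired flag plays the role of the user option mentioned above; the join variant would give the same rows here, so the subquery is used only because it leaves the base query untouched.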