Re: [Dspace-tech] How to stop using full text search?

2014-02-07 Thread helix84
On Thu, Feb 6, 2014 at 11:19 PM, Calloni, Rodrigo rcall...@iadb.org wrote:
> I am getting Internal Server Error when I run the index after cleaning up the
> index and changing the schema.xml
>
> When I add -b the error message doesn't show up.

So, I understand that it's working when you reindex with -b, right?


Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

--
Managing the Performance of Cloud-Based Applications
Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
Read the Whitepaper.
http://pubads.g.doubleclick.net/gampad/clk?id=121051231iu=/4140/ostg.clktrk
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette


[Dspace-tech] How to stop using full text search?

2014-02-06 Thread Calloni, Rodrigo
Hello

We are in DSpace 3.2 XMLUI.

I am trying to stop our DSpace from using the full text extracted by 
filter-media in the search.

I already commented out filter.plugins in dspace.cfg, but I don't think
that is enough.

Is there a configuration setting that defines which fields are indexed
for search?

Thanks in advance
Rodrigo



Re: [Dspace-tech] How to stop using full text search?

2014-02-06 Thread helix84
On Thu, Feb 6, 2014 at 3:56 PM, Calloni, Rodrigo rcall...@iadb.org wrote:
> I already commented out the filter.plugins from dspace.cfg but I don't think
> that is enough.

The Solr index still contains the previously extracted text. You have
to rebuild the Solr index:

[dspace]/bin/dspace update-discovery-index -b


> Is there a configuration that we can do to define which fields will be
> indexed for the search?

Yes, you can comment out the fulltext fields from
[dspace]/solr/search/conf/schema.xml before you reindex.

Alternatively, since you already disabled the filters, you can just
remove the extracted text files from the TEXT bundles and rebuild the
index - thus there will be nothing to add to the index. This approach
also ensures that you didn't leave any extracted bitstreams in TEXT
bundles lying around, potentially exposed anonymously (I don't
remember off-hand whether these are accessible via HTTP).
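
For the schema.xml route mentioned above, it may help to keep a copy of the
file and to see exactly which field definitions are involved before
commenting anything out. This is a minimal shell sketch, not part of the
original advice: both function names are made up for illustration, and the
grep pattern assumes the field is declared as <field name="fulltext" ...>
in your schema.xml.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not DSpace commands.

# Print (with line numbers) the <field> definitions named "fulltext".
# Assumes the declaration starts with: <field name="fulltext"
list_fulltext_fields() {
    schema="$1"
    grep -n '<field name="fulltext"' "$schema"
}

# Keep a copy of the schema next to the original before editing it by hand.
backup_schema() {
    schema="$1"
    cp -p "$schema" "$schema.bak"
}
```

Possible usage, with [dspace] replaced by your installation directory:

backup_schema [dspace]/solr/search/conf/schema.xml
list_fulltext_fields [dspace]/solr/search/conf/schema.xml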


Regards,
~~helix84




Re: [Dspace-tech] How to stop using full text search?

2014-02-06 Thread helix84
OK, I don't know why, but perhaps the -b flag isn't removing the old
contents. Try removing the index manually:

cp -r [dspace]/solr/search/data/index [dspace]/solr/search/data/index.bak
rm [dspace]/solr/search/data/index/*

And then recreate it with the new schema using:
[dspace]/bin/dspace update-discovery-index
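
The manual cleanup above can be wrapped in a small helper that refuses to
overwrite an earlier backup. This is only a sketch: reset_index_dir is a
hypothetical name, and stopping the servlet container that hosts Solr before
running it is an assumption, not something stated in this thread.

```shell
#!/bin/sh
# Sketch only: reset_index_dir is a hypothetical helper, not a DSpace command.

# Copy the index directory to <dir>.bak, then delete the files inside it.
# Returns non-zero if the directory is missing or a backup already exists.
reset_index_dir() {
    index_dir="$1"
    backup_dir="$index_dir.bak"
    if [ ! -d "$index_dir" ]; then
        echo "no such directory: $index_dir" >&2
        return 1
    fi
    if [ -e "$backup_dir" ]; then
        echo "backup already exists: $backup_dir" >&2
        return 1
    fi
    cp -r "$index_dir" "$backup_dir" || return 1
    # -f keeps rm quiet when the directory is already empty.
    rm -f "$index_dir"/*
}
```

Possible usage, with [dspace] replaced by your installation directory:

reset_index_dir [dspace]/solr/search/data/index && [dspace]/bin/dspace update-discovery-index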

You should be able to verify that these fields are gone from the index
by going to the Solr admin UI [1]. Actually, do this even now before
you execute the above steps, just to confirm the problem.

[1] https://wiki.duraspace.org/display/DSPACE/Solr
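
Besides the admin UI, a query from the command line can give a quick check.
The sketch below only builds the URL. field_count_url is a hypothetical
helper, the base URL is the one that appears in the request shown in the
error output in this thread, and fulltext:[* TO *] is ordinary Solr range
syntax for "any value in this field".

```shell
#!/bin/sh
# Sketch only: field_count_url is a hypothetical helper.

# Build a Solr select URL that counts documents having any value in a field.
# The query <field>:[* TO *] is URL-encoded; rows=0 asks for the count only.
field_count_url() {
    base="$1"
    field="$2"
    printf '%s/select?q=%s:%%5B*%%20TO%%20*%%5D&rows=0\n' "$base" "$field"
}
```

Possible usage, assuming curl is installed and Solr is reachable locally:

curl -s "$(field_count_url http://localhost:8080/solr/search fulltext)"

A numFound of 0 in the response would suggest that no documents carry
indexed full text. If the field has been removed from the schema, Solr will
probably answer with an undefined field error instead.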


Regards,
~~helix84




Re: [Dspace-tech] How to stop using full text search?

2014-02-06 Thread Calloni, Rodrigo
Thanks Ivan. When I commented out the fulltext lines in schema.xml, the
reindex finished very quickly and the search stopped working.

<!--<field name="text" type="text" indexed="true" stored="false" multiValued="true"/> -->

<!--<field name="fulltext" type="text" indexed="true" stored="true" multiValued="true"/> -->

I restored the lines and the reindex finished OK, but full text is still being
used in search.

Any ideas?

Rodrigo




Re: [Dspace-tech] How to stop using full text search?

2014-02-06 Thread Calloni, Rodrigo
Thanks again,

I am getting an Internal Server Error when I run the indexer after cleaning up
the index and changing schema.xml:

[dspace@ip-172-31-28-251 bin]$ ./dspace update-discovery-index
 INFO [main] (DSpaceKernelInit.java:52) - Created new kernel: DSpaceKernel:org.dspace:name=23dd0b1b-b577-456b-a835-3e81125b28c9,type=DSpaceKernel:lastLoad=null:loadTime=0:running=false:kernel=null
 INFO [main] (ConfigurationManager.java:1217) - Loading from classloader: file:/home/dspace/dspace/config/dspace.cfg
 INFO [main] (ConfigurationManager.java:1217) - Using dspace provided log configuration (log.init.config)
 INFO [main] (ConfigurationManager.java:1217) - Loading: /home/dspace/dspace/config/log4j.properties
Exception: Error executing query
org.dspace.discovery.SearchServiceException: Error executing query
    at org.dspace.discovery.SolrServiceImpl.cleanIndex(SolrServiceImpl.java:418)
    at org.dspace.discovery.IndexClient.main(IndexClient.java:119)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)
Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
    at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:266)
    at org.dspace.discovery.SolrServiceImpl.getSolr(SolrServiceImpl.java:106)
    at org.dspace.discovery.SolrServiceImpl.cleanIndex(SolrServiceImpl.java:388)
    ... 6 more
Caused by: org.apache.solr.common.SolrException: Internal Server Error

Internal Server Error

request: http://localhost:8080/solr/search/select?q=search.resourcetype:2 AND search.resourceid:1&wt=javabin&version=2
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:432)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:246)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
    ... 9 more

When I add -b the error message doesn't show up.

Rodrigo
