RE: Indexing from Nutch crawl

2011-04-18 Thread McGibbney, Lewis John
)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2011-04-18 11:27:11,033 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
2011-04-18 11:27:11,869 INFO  solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2011-04-18 11:27:11
2011-04-18 11:27:11,870 INFO  solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://localhost:8080/wombra/data
2011-04-18 11:27:13,048 INFO  solr.SolrClean - SolrClean: starting at 2011-04-18 11:27:13
2011-04-18 11:27:13,888 INFO  solr.SolrClean - SolrClean: deleting 5 documents
2011-04-18 11:27:13,992 WARN  mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Not Found

Not Found

request: http://localhost:8080/wombra/data/update?wt=javabin&version=1
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.nutch.indexer.solr.SolrClean$SolrDeleter.close(SolrClean.java:115)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:473)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
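
A "Not Found" response to the update request means the path did not resolve to a Solr update handler. As a sketch only: the helper below (hypothetical, not part of Nutch or SolrJ) rebuilds the request URL from the configured Solr URL so that candidate base URLs can be compared. The second base URL is an assumption about this deployment, not something the log confirms.

```javascript
// Hypothetical helper: mirror how the request URL in the log is formed,
// i.e. the configured Solr URL with the update handler path appended.
function updateUrl(solrUrl) {
  var base = solrUrl.replace(/\/+$/, '');  // drop any trailing slash
  return base + '/update?wt=javabin&version=1';
}

// URL taken from the log.
console.log(updateUrl('http://localhost:8080/wombra/data'));
// Assumed alternative (the webapp context root) to compare against.
console.log(updateUrl('http://localhost:8080/wombra'));
```

If the second form is the one that reaches Solr, the URL given to the Nutch indexing command would need to change accordingly.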



From: Markus Jelsma [markus.jel...@openindex.io]
Sent: 18 April 2011 11:59
To: solr-user@lucene.apache.org
Cc: McGibbney, Lewis John
Subject: Re: Indexing from Nutch crawl

Can you include hadoop.log output? Likely the other commands fail as well but
don't write the exception to stdout.


Glasgow Caledonian University is a registered Scottish charity, number SC021474

Winner: Times Higher Education’s Widening Participation Initiative of the Year 
2009 and Herald Society’s Education Initiative of the Year 2009.
http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html

Winner: Times Higher Education’s Outstanding Support for Early Career 
Researchers of the Year 2010, GCU as a lead with Universities Scotland partners.
http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html


RE: Indexing from Nutch crawl

2011-04-18 Thread McGibbney, Lewis John
Hi Ramires,

I have been using Solr 1.4.1

My understanding from the example solrconfig.xml is that JARs will be loaded 
from the /lib directory. I do not have a /dist directory because I copied the 
example directory as my Solr home directory, so I have commented out 
these entries in solrconfig.xml.

Can you elaborate on your comment below, please, as I may be missing your 
point?

Thank you Lewis



From: ramires [uy...@beriltech.com]
Sent: 18 April 2011 13:40
To: solr-user@lucene.apache.org
Subject: Re: Indexing from Nutch crawl

This is a problem with these files in the Nutch lib directory. You can easily
replace them with the ones in the Solr dist directory.

 apache-solr-core-1.4.0.jar
apache-solr-solrj-1.4.0.jar


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-from-Nutch-crawl-tp2833862p2834270.html
Sent from the Solr - User mailing list archive at Nabble.com.

Email has been scanned for viruses by Altman Technologies' email management 
service - www.altman.co.uk/emailsystems



Implementing Facets

2011-03-21 Thread McGibbney, Lewis John
Hi list,

I am working with an Ajax-Solr GUI, but I am getting the following error from 
Firebug when launching the web app on Tomcat 7.0.11. The web app uses Solr 
version 1.4.1.

HTTP Status 400 - undefined field links
type: Status report
message: undefined field links
description: The request sent by the client was syntactically incorrect (undefined field links).

The facet details, as configured in Ajax-Solr are as follows

  'facet.field': [ 'topics', 'organisations', 'exchanges', 'countryCodes' ],
  'facet.limit': 20,
  'facet.mincount': 1,
  'f.topics.facet.limit': 50,
  'f.countryCodes.facet.limit': -1,
  'facet.date': 'date',
  'facet.date.start': '1987-02-26T00:00:00.000Z/DAY',
  'facet.date.end': '1987-10-20T00:00:00.000Z/DAY+1DAY',
  'facet.date.gap': '+1DAY',
  'json.nl': 'map'

I tried configuring the above by adding the following snippet to the dismax 
requestHandler in solrconfig.xml as follows

 <str name="f.name.hl.alternateField">name</str>
 <str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
 <str name="facet.field">topics</str>
 <str name="facet.field">organisations</str>
 <str name="facet.field">exchanges</str>
 <str name="facet.field">countryCodes</str>
 <str name="facet.limit">20</str>
 <str name="facet.mincount">1</str>
 <str name="f.topics.facet.limit.">50</str>
 <str name="fcountryCodes.facet.limit.">-1</str>
 <str name="facet.date">date</str>
 <str name="facet.date.start">2000-01-01T00:00:00.000Z/DAY</str>
 <str name="facet.date.end">2011-03-21T00:00:00.000Z/DAY+1DAY</str>
 <str name="facet.date.gap">+1DAY</str>

But I am still getting the error. I am not clear about how and where to 
configure the facet details. Can anyone suggest how I can properly implement 
the facets that I want?
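
For illustration, a sketch of how a parameter map like the one configured in Ajax-Solr could be serialised into a Solr query string, with array values becoming repeated parameters. `toQueryString` is a hypothetical helper written for this note, not an Ajax-Solr function, and the map is a shortened copy of the one above.

```javascript
// Hypothetical helper (not part of Ajax-Solr): serialise a parameter map
// into a query string, repeating the key for each value of an array.
function toQueryString(params) {
  var parts = [];
  Object.keys(params).forEach(function (key) {
    var values = Array.isArray(params[key]) ? params[key] : [params[key]];
    values.forEach(function (value) {
      parts.push(encodeURIComponent(key) + '=' + encodeURIComponent(String(value)));
    });
  });
  return parts.join('&');
}

var facetParams = {
  'facet': true,
  'facet.field': ['topics', 'organisations'],
  'facet.limit': 20,
  'f.topics.facet.limit': 50
};

console.log(toQueryString(facetParams));
```

Whichever side sets these parameters (the client request or the handler defaults in solrconfig.xml), every name listed under facet.field still has to be a field that Solr knows about.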

Thank you kindly
Lewis



RE: Implementing Facets

2011-03-21 Thread McGibbney, Lewis John
Hi Ahmet,

Yes, this is the case. I have changed it to reflect your suggestion; thank you 
for this.

After reloading the app I still get the error. Here is the full stack trace 
from catalina.out:

INFO: [] Registered new searcher Searcher@8af0b0 main
21-Mar-2011 20:28:53 org.apache.solr.common.SolrException log
SEVERE: Exception during facet counts:org.apache.solr.common.SolrException: undefined field topics
at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1077)
at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:226)
at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:283)
at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:166)
at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:164)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:498)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:562)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:394)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:243)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:188)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:302)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

21-Mar-2011 20:28:53 org.apache.solr.core.SolrCore execute
INFO: [] webapp=/wombra path=/select params={json.wrf=jsonp1300739332983&facet.date.start=1987-02-26T00:00:00.000Z/DAY&facet=true&facet.mincount=1&facet.limit=20&facet.date=date&f.topics.facet.limit=50&json.nl=map&wt=json&q=*:*&_=1300739333613&facet.field=topics&facet.field=organisations&facet.field=exchanges&facet.field=countryCodes&facet.date.gap=%2B1DAY&f.countryCodes.facet.limit=-1&facet.date.end=1987-10-20T00:00:00.000Z/DAY%2B1DAY} hits=21 status=0 QTime=60
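
The "undefined field topics" message indicates that a name passed in facet.field is not declared in the schema. A minimal sketch of that check, assuming the schema's field names have been copied out by hand: the `schemaFields` list is illustrative (only `content` is shown anywhere in this archive), and `undefinedFacetFields` is a hypothetical helper, not a Solr or Ajax-Solr function.

```javascript
// Hypothetical helper: return the facet fields that the schema does not declare.
function undefinedFacetFields(schemaFields, facetFields) {
  return facetFields.filter(function (name) {
    return schemaFields.indexOf(name) === -1;
  });
}

// Illustrative schema field names (assumption; check your own schema.xml).
var schemaFields = ['id', 'title', 'content'];
// Facet fields taken from the logged request.
var facetFields = ['topics', 'organisations', 'exchanges', 'countryCodes'];

console.log(undefinedFacetFields(schemaFields, facetFields));
```

Any name this reports would need either a matching field definition in schema.xml (followed by a reindex) or removal from the facet.field list.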

From: Ahmet Arslan [iori...@yahoo.com]
Sent: 21 March 2011 20:25
To: solr-user@lucene.apache.org
Subject: Re: Implementing Facets


Could it be the missing dot in <str name="fcountryCodes.facet.limit.">-1</str>?

<str name="f.countryCodes.facet.limit">-1</str>?
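
Per-field facet overrides follow the pattern f.<fieldName>.facet.<option>. A small sketch of a name check along those lines, assuming field names contain no dots; `isPerFieldFacetParam` is a hypothetical helper for illustration only.

```javascript
// Pattern for per-field overrides: "f." + field name + ".facet." + option,
// where the option may itself be dotted (e.g. "date.gap").
var PER_FIELD_FACET = /^f\.([^.]+)\.facet\.([A-Za-z]+(?:\.[A-Za-z]+)*)$/;

// Hypothetical helper: true when the parameter name is a well-formed override.
function isPerFieldFacetParam(name) {
  return PER_FIELD_FACET.test(name);
}

console.log(isPerFieldFacetParam('f.countryCodes.facet.limit'));  // corrected name
console.log(isPerFieldFacetParam('fcountryCodes.facet.limit.'));  // name as first posted
```

The first call passes and the second does not, because the posted name lacks the dot after "f" and carries a stray trailing dot.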




RE: Using Solr 1.4.1 on most recent Tomcat 7.0.11

2011-03-17 Thread McGibbney, Lewis John
Hi François,

Thank you for your reply. I had made the simple mistake of including comments 
before
'<?xml version="1.0" encoding="utf-8"?>', therefore I was getting a SAX error.
As you have correctly pointed out, it is not essential to include the snippet 
above in the context file (if using one); however, it might be useful to know 
that Tomcat 7 now validates XML files by default. In time I will get round to 
editing the wiki accordingly to guard against this in the future.
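
The rule behind that SAX error is that an XML declaration, if present, must be the very first thing in the file. A rough sketch of a pre-deployment check, assuming the descriptor has already been read into a string; `xmlDeclarationIsFirst` is a hypothetical helper, not part of Tomcat or Solr.

```javascript
// Hypothetical helper: an XML declaration is optional, but if one is present
// it must start at the first character of the document.
function xmlDeclarationIsFirst(text) {
  var index = text.search(/<\?xml\s/i);
  return index === -1 || index === 0;
}

var good = '<?xml version="1.0" encoding="utf-8"?>\n<Context/>';
var bad = '<!-- my notes -->\n<?xml version="1.0" encoding="utf-8"?>\n<Context/>';

console.log(xmlDeclarationIsFirst(good));
console.log(xmlDeclarationIsFirst(bad));
```

Moving any comments below the declaration, or dropping the declaration altogether, satisfies the check.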

Thanks for looking into this.

Lewis
___
From: François Schiettecatte [fschietteca...@gmail.com]
Sent: 17 March 2011 13:47
To: solr-user@lucene.apache.org
Subject: Re: Using Solr 1.4.1 on most recent Tomcat 7.0.11

Lewis

My update from Tomcat 7.0.8 to 7.0.11 went without a hitch. I checked my 
context file and it does not have the XML preamble yours has, specifically: 
'<?xml version="1.0" encoding="utf-8"?>'.


Here is my context file:

<Context docBase="/home/omim/lib/java/apache-solr-4.0-2011-02-09_08-06-20.war" 
debug="0" crossContext="true">
   <Environment name="solr/home" type="java.lang.String" 
value="/home/omim/index/" override="true" />
</Context>
---

Hope this helps.

Cheers

François



RE: Faceting help

2011-03-16 Thread McGibbney, Lewis John
Hi Upayavira,

I use the term constraint to mean the additional options offered under each 
facet for a user to refine a search. Thinking of them as sub-facets may 
explain it in slightly better terms.

I didn't add additional document source types in my original email, but if I 
knew that there would be xls and doc contained within the
Solr index then these would also be added as sub-facets, allowing a user to 
select them prior to entering a search query.

Can you point me towards documentation or something similar in order to 
implement the above? I am aware that I have a lot more to
learn on faceted search, namely how to properly implement it!
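
In Solr terms, each "constraint" is a value of the facet field, and selecting one is commonly expressed as a filter query (fq) on that field. A minimal sketch under that reading; the field names `source` and `topics` come from the outline in this thread and are assumed to exist in the schema, and `constraintToFq` is a hypothetical helper.

```javascript
// Hypothetical helper: turn a selected facet constraint into an fq value,
// quoting the value so that spaces and slashes are treated literally.
function constraintToFq(field, value) {
  var escaped = String(value).replace(/(["\\])/g, '\\$1');
  return field + ':"' + escaped + '"';
}

console.log(constraintToFq('source', 'pdf'));
console.log(constraintToFq('topics', 'Guidance/Policies'));
```

The resulting string would be sent as an additional fq parameter alongside the user's query.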

Thank you Lewis

From: Upayavira [u...@odoko.co.uk]
Sent: 15 March 2011 22:42
To: solr-user@lucene.apache.org
Subject: Re: Faceting help

I'm not sure if I get what you are trying to achieve. What do you mean
by constraint?

Are you saying that you effectively want to filter the facets that are
returned?

e.g. for source field, you want to show html/pdf/email, but not, say xls
or doc?

Upayavira


 Topics  field
   Legislation  constraint
   Guidance/Policies  constraint
   Customer Service information/complaints procedure  constraint
   financial information  constraint
   etc etc

 Source  field
   html  constraint  constraint
   pdf  constraint
   email  constraint
   etc etc

 Date  field
 constraint

 Basically I need resources to understand how to implement the above
 instead of the example I currently have.
 Some guidance would be great
 Thank you kindly

 Lewis


---
Enterprise Search Consultant at Sourcesense UK,
Making Sense of Open Source



RE: hierarchical faceting, SOLR-792 - confused on config

2011-03-16 Thread McGibbney, Lewis John
Hi,

This is also where I am having problems. I have not been able to understand 
very much of the wiki page,
and I do not understand how to configure the faceting we are referring to.
Although I know very little about this, I can't help but think that the wiki is 
quite clearly inaccurate by some way!

Any comments please
Lewis

From: kmf [kfole...@gmail.com]
Sent: 23 February 2011 17:10
To: solr-user@lucene.apache.org
Subject: Re: hierarchical faceting, SOLR-792 - confused on config

I'm really confused now.  Is this page completely out of date -
http://wiki.apache.org/solr/HierarchicalFaceting - as it seems to imply that
solr-792 is a form of hierarchical faceting: "There are currently two
similar, non-competing, approaches to generating tree/hierarchical facets
from Solr: SOLR-64 and SOLR-792"

To achieve hierarchical faceting, is the rule then that you form the
hierarchical facets using a transformer in the DIH and do nothing in
schema.xml or solrconfig.xml?   I seem to recall reading somewhere that
creating a copyField is needed.  Sorry for the entry level question but, I'm
still trying to understand how to configure solr to do hierarchical
faceting.

Thanks,
kmf
--
View this message in context: 
http://lucene.472066.n3.nabble.com/hierarchical-faceting-SOLR-792-confused-on-config-tp2556394p2561445.html
Sent from the Solr - User mailing list archive at Nabble.com.



Using Solr 1.4.1 on most recent Tomcat 7.0.11

2011-03-16 Thread McGibbney, Lewis John
Hello list,

Is anyone running Solr (in my case 1.4.1) on the above Tomcat distribution? In
the past I have been following the guidance at
http://wiki.apache.org/solr/SolrTomcat#Installing_Solr_instances_under_Tomcat
but, having upgraded from Tomcat 7.0.8 to 7.0.11, I am having problems,
e.g.

INFO: Deploying configuration descriptor wombra.xml  (this is my context fragment from /home/lewis/Downloads/apache-tomcat-7.0.11/conf/Catalina/localhost)
16-Mar-2011 16:57:36 org.apache.tomcat.util.digester.Digester fatalError
SEVERE: Parse Fatal Error at line 4 column 6: The processing instruction target matching [xX][mM][lL] is not allowed.
org.xml.sax.SAXParseException: The processing instruction target matching [xX][mM][lL] is not allowed.
...
16-Mar-2011 16:57:36 org.apache.catalina.startup.HostConfig deployDescriptor
SEVERE: Error deploying configuration descriptor wombra.xml
org.xml.sax.SAXParseException: The processing instruction target matching [xX][mM][lL] is not allowed.
...
some more
...

My configuration descriptor is as follows:
<?xml version="1.0" encoding="utf-8"?>
<Context docBase="/home/lewis/Downloads/wombra/wombra.war"
crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
value="/home/lewis/Downloads/wombra" override="true"/>
</Context>

Preferably I would upload a WAR file, but the configuration I have been
using has worked well up until now, so I didn't question it.
I am unfamiliar with the above errors. Can anyone please point me in the
right direction?

Thank you
Lewis



RE: hierarchical faceting, SOLR-792 - confused on config

2011-03-16 Thread McGibbney, Lewis John
Hi Erik,

I have been reading about the progression of SOLR-792 into pivot faceting; 
however, can you expand on
where it is committed? Are you referring to trunk?
The reason I am asking is that I have been using 1.4.1 for some time now and 
have been thinking of upgrading to trunk... or branch.

Thank you Lewis

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 16 March 2011 17:36
To: solr-user@lucene.apache.org
Subject: Re: hierarchical faceting, SOLR-792 - confused on config

Sorry, I missed the original mail on this thread

I put together that hierarchical faceting wiki page a couple of years ago when 
helping a customer evaluate SOLR-64 vs. SOLR-792 vs. other approaches.  Since 
then, SOLR-792 morphed and is committed as pivot faceting.  SOLR-64 spawned a 
PathTokenizer which is part of Solr now too.

Recently Toke updated that page with some additional info.  It's definitely not 
a how-to page, and perhaps should get renamed/moved/revamped?  Toke?

Erik
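
As a rough illustration of the path-tokenizing idea mentioned above: a hierarchical value is indexed as every ancestor prefix, so that faceting on the field yields counts at each level. The sketch below is plain JavaScript rather than the Solr tokenizer itself, and the example path is invented for illustration.

```javascript
// Hypothetical helper: expand a delimited path into all of its prefixes,
// which is the kind of token stream a path hierarchy tokenizer produces.
function pathPrefixes(path, delimiter) {
  var parts = path.split(delimiter);
  var prefixes = [];
  for (var i = 1; i <= parts.length; i++) {
    prefixes.push(parts.slice(0, i).join(delimiter));
  }
  return prefixes;
}

// Invented example value; real paths depend on your own data.
console.log(pathPrefixes('Topics/Legislation/Scotland', '/'));
```

Faceting on a field indexed this way would then return a count for "Topics", for "Topics/Legislation", and so on down the tree.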




Faceting help

2011-03-15 Thread McGibbney, Lewis John
Hello list,

I'm trying to use facets via widgets within Ajax-Solr. I have tried the wiki 
for general help on configuring facets and constraints and also attended the 
recent Lucidworks webinar on faceted search. Can anyone please direct me to 
some reading on how to formally configure facets for searching?

Currently my facets are configured as follows

  'facet.field': [ 'topics', 'organisations', 'exchanges', 'countryCodes' ],
  'facet.limit': 20,
  'facet.mincount': 1,
  'f.topics.facet.limit': 50,
  'f.countryCodes.facet.limit': -1,
  'facet.date': 'date',
  'facet.date.start': '1987-02-26T00:00:00.000Z/DAY',
  'facet.date.end': '1987-10-20T00:00:00.000Z/DAY+1DAY',
  'facet.date.gap': '+1DAY',
  'json.nl': 'map'

However I wish to change the fields to contain some constraints such as

Topics  field
  Legislation  constraint
  Guidance/Policies  constraint
  Customer Service information/complaints procedure  constraint
  financial information  constraint
  etc etc

Source  field
  html  constraint  constraint
  pdf  constraint
  email  constraint
  etc etc

Date  field
constraint

Basically I need resources to understand how to implement the above instead of 
the example I currently have.
Some guidance would be great
Thank you kindly

Lewis



Text field not defined in Solr Schema?

2011-02-26 Thread McGibbney, Lewis John
Hello list,

I have recently been working on some JS (ajax solr) and when using Firebug I am 
alerted to an error within the JS file as below.
It immediately breaks on line 12 stating that 'doc.text' is undefined! Here is 
the code snippet.

10 AjaxSolr.theme.prototype.snippet = function (doc) {
11   var output = '';
12   if (doc.text.length > 300) {
13     output += doc.dateline + ' ' + doc.text.substring(0, 300);
14     output += '<span style="display:none;">' + doc.text.substring(300);
15     output += '</span> <a href="#" class="more">more</a>';
16   }
17   else {
18     output += doc.dateline + ' ' + doc.text;
19   }
20   return output;
21 };

I have been advised that the problem might stem from my schema not defining a 
text field, however as my implementation of Solr
is currently geared to index docs from a Nutch web crawl I am using the Nutch 
schema. A snippet of the schema is below

<schema name="nutch" version="1.1">
<types>
   ...
<fieldType name="text" class="solr.TextField"
positionIncrementGap="100">
<analyzer>
...
</types>
<fields>
...
<field name="content" type="text" stored="true" indexed="true"/>
</fields>
</schema>

Can someone confirm whether I need to add something similar to the following

<fields>
...
<field name="text" type="text" stored="true" indexed="true"/>
</fields>

and then perform a fresh crawl and reindex so that the schema field is recognised 
by the JS snippet?

Also (sorry, I apologise): from my reading on the Solr schema, I became intrigued 
by the options for TextField, namely compressed
and compressThreshold. I understand that they are used hand in glove; however, 
can anyone please explain what benefits compression
adds and what integer value would be appropriate for the latter option?

Any help would be great
Thank you Lewis



RE: Text field not defined in Solr Schema?

2011-02-26 Thread McGibbney, Lewis John
Thank you Markus,

I am wondering if anyone can comment on the latter question I posted regarding 
TextField
or StrField with compression options. I understand the methodology behind 
adding compressThreshold
to the field type definition (1st part of my schema) and adding individual 
options to the individual field definitions (2nd part of my schema);
my question concerns any real benefits which can be gained when implemented in a 
'small/medium' Solr use case.

Thank you Lewis

From: Markus Jelsma [markus.jel...@openindex.io]
Sent: 26 February 2011 13:42
To: solr-user@lucene.apache.org
Cc: McGibbney, Lewis John
Subject: Re: Text field not defined in Solr Schema?

Yes, you need to add the field text of type Text or use content instead of
text.
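
A sketch of the second option, reading the Nutch `content` field when `text` is absent. It is written as a standalone function so it can be tried outside Ajax-Solr; the fallback order and the empty-string default are assumptions, and `dateline` is simply omitted when the document has none.

```javascript
// Standalone variant of the theme's snippet function (assumption: documents
// indexed from Nutch carry `content` rather than `text`).
function snippet(doc) {
  var text = doc.text || doc.content || '';          // fall back to content
  var dateline = doc.dateline ? doc.dateline + ' ' : '';
  if (text.length > 300) {
    // Show the first 300 characters and hide the remainder behind a link.
    return dateline + text.substring(0, 300) +
      '<span style="display:none;">' + text.substring(300) +
      '</span> <a href="#" class="more">more</a>';
  }
  return dateline + text;
}

console.log(snippet({ content: 'Indexed from a Nutch crawl.' }));
```

Inside Ajax-Solr the same body could be assigned to AjaxSolr.theme.prototype.snippet, which avoids the "doc.text is undefined" error without changing the schema.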




Solr Ajax

2011-02-19 Thread McGibbney, Lewis John
Hello list,

I'm in the process of trying to implement Ajax within my Solr-backed webapp. I 
have
been reading both the SolrJ wiki as well as the tutorial provided via
the Google group and various info from the wiki page 
https://github.com/evolvingweb/ajax-solr/wiki

I have all SolrJ jar libraries available in my webapp /lib but I am
unsure as to what steps I should take to configure the SolrJ client. What do I need to 
configure to begin working with SolrJ? I am unsure as to where to go, and 
finding information on the wiki seems to be a non-trivial task.

Any help would be great. Thanks

Lewis



RE: Errors when implementing VelocityResponseWriter

2011-02-16 Thread McGibbney, Lewis John
Managed to get this working. Swapped my solrconfig for the one provided in the 
velocity dir, repackaged the war file and redeployed on Tomcat.

Although this seems like a ridiculously obvious thing to do, I somehow 
overlooked the repackaging step; this was where the problem was.

Thanks for the help Erik

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 16 February 2011 08:06
To: solr-user@lucene.apache.org
Subject: Re: Errors when implementing VelocityResponseWriter

Well, you need to specify a path, relative or absolute, that points to the 
directory where the Velocity JAR file resides.

I'm not sure, at this point, exactly what you're missing.  But it should be 
fairly straightforward.  Solr startup logs the libraries it loads, so maybe 
that is helpful info.

1.4.1 - does it support <lib>?  (I'm not sure off the top of my head)

Erik

On Feb 15, 2011, at 12:04 , McGibbney, Lewis John wrote:

 Hi Erik thank you for the reply

 I have placed all velocity jar files in my /lib directory. As explained 
 below, I have added relevant configuration to solrconfig.xml, I am just 
 wondering if the config instructions in the wiki are missing something? Can 
 anyone advise on this.

 As you mentioned, my terminal output suggests that the VelocityResponseWriter 
 class is not present and therefore the velocity jar is not present... however 
 this is not the case.

 I have specified <lib dir="./lib" /> in solrconfig.xml; is this enough, or do 
 I need to use an exact path? I have already tried specifying an exact path 
 and it does not seem to work either.

 Thank you

 Lewis
 
 From: Erik Hatcher [erik.hatc...@gmail.com]
 Sent: 15 February 2011 06:48
 To: solr-user@lucene.apache.org
 Subject: Re: Errors when implementing VelocityResponseWriter

 Looks like you're missing the Velocity JAR.  It needs to be in some 
 Solr-visible lib directory.  With 1.4.1 you'll need to put it in <solr-home>/lib.  
 In later versions, you can use the <lib> elements in solrconfig.xml to point 
 to other directories.

Erik

 On Feb 14, 2011, at 10:41 , McGibbney, Lewis John wrote:

 Hello List,

 I am currently trying to implement the above in Solr 1.4.1. Having moved 
 velocity directory from $SOLR_DIST/contrib/velocity/src/main/solr/conf to my 
 webapp /lib directory, then adding queryResponseWriter name=blah and 
 class=blah followed by the responseHandler specifics I am shown the 
 following terminal output. I also added <lib dir="./lib" /> in solrconfig. 
 Can anyone suggest what I have not included in the config that is still 
 required?

 Thanks Lewis

 SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.response.VelocityResponseWriter'
   at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
   at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
   at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
   at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
   at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
   at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
   at org.apache.solr.core.SolrCore.initWriters(SolrCore.java:1408)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:547)
   at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
   at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
   at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
   at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
   at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
   at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:98)
   at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
   at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
   at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.ClassNotFoundException: org.apache.solr.response.VelocityResponseWriter
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:307

RE: Errors when implementing VelocityResponseWriter

2011-02-15 Thread McGibbney, Lewis John
Hi Erik, thank you for the reply.

I have placed all the Velocity JAR files in my /lib directory. As explained below, 
I have added the relevant configuration to solrconfig.xml; I am just wondering if 
the config instructions in the wiki are missing something. Can anyone advise on 
this?

As you mentioned, my terminal output suggests that the VelocityResponseWriter 
class is not present and therefore that the Velocity JAR is not present. However, 
this is not the case.

I have specified <lib dir="./lib" /> in solrconfig.xml. Is this enough, or do I 
need to use an exact path? I have already tried specifying an exact path and it 
does not seem to work either.

Thank you

Lewis

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 15 February 2011 06:48
To: solr-user@lucene.apache.org
Subject: Re: Errors when implementing VelocityResponseWriter

Looks like you're missing the Velocity JAR.  It needs to be in some 
Solr-visible lib directory.  With 1.4.1 you'll need to put it in solr-home/lib.  
In later versions, you can use the <lib> elements in solrconfig.xml to point to 
other directories.

Erik
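
One way to check this diagnosis is to look inside the JARs themselves. The sketch below is not part of Solr; it only scans a directory for JARs that contain a given class file. The class name FindClassInJars, its method names, and the default directory "lib" are assumptions made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

public class FindClassInJars {

    /** Converts a fully qualified class name to its entry path inside a JAR. */
    static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns the names of the JARs in libDir that contain the given class. */
    static List<String> jarsContaining(File libDir, String className) throws IOException {
        List<String> hits = new ArrayList<>();
        File[] jars = libDir.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars == null) {
            return hits; // not a directory, or not readable
        }
        String entryName = toEntryName(className);
        for (File jar : jars) {
            try (JarFile jf = new JarFile(jar)) {
                if (jf.getJarEntry(entryName) != null) {
                    hits.add(jar.getName());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Both defaults are placeholders; pass the real lib directory and class name.
        File libDir = new File(args.length > 0 ? args[0] : "lib");
        String cls = args.length > 1 ? args[1]
                : "org.apache.solr.response.VelocityResponseWriter";
        System.out.println(cls + " found in: " + jarsContaining(libDir, cls));
    }
}
```

If the list comes back empty for the directory Solr actually reads, then no JAR there provides the writer class under that package name, whatever else the directory contains.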

On Feb 14, 2011, at 10:41 , McGibbney, Lewis John wrote:

 Hello List,

 I am currently trying to implement the above in Solr 1.4.1. Having moved 
 velocity directory from $SOLR_DIST/contrib/velocity/src/main/solr/conf to my 
 webapp /lib directory, then adding queryResponseWriter name=blah and 
 class=blah followed by the responseHandler specifics I am shown the 
 following terminal output. I also added <lib dir="./lib" /> in solrconfig. 
 Can anyone suggest what I have not included in the config that is still 
 required?

 Thanks Lewis

 SEVERE: org.apache.solr.common.SolrException: Error loading class 
 'org.apache.solr.response.VelocityResponseWriter'
at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
at org.apache.solr.core.SolrCore.initWriters(SolrCore.java:1408)
at org.apache.solr.core.SolrCore.init(SolrCore.java:547)
at 
 org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
at 
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
at 
 org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
at 
 org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
at 
 org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
at 
 org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
at 
 org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
at 
 org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
at 
 org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.solr.response.VelocityResponseWriter
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
... 21 more

 Glasgow Caledonian University is a registered Scottish charity, number 
 SC021474

 Winner: Times Higher Education’s Widening Participation Initiative of the 
 Year 2009 and Herald Society’s Education Initiative of the Year 2009.
 http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,6219,en.html

 Winner: Times Higher Education’s Outstanding Support for Early Career 
 Researchers of the Year 2010, GCU as a lead with Universities Scotland 
 partners.
 http://www.gcu.ac.uk/newsevents/news/bycategory/theuniversity/1/name,15691,en.html

Email has been scanned for viruses by Altman Technologies

RE: Errors when implementing VelocityResponseWriter

2011-02-15 Thread McGibbney, Lewis John
To add to this (which stupidly, I have not mentioned previously) I am using 
Tomcat 7.0.8 as my servlet container. I have a sneaking suspicion that this is 
what is causing the problem, but as per below, I am unsure as to a solution.

From: McGibbney, Lewis John [lewis.mcgibb...@gcu.ac.uk]
Sent: 15 February 2011 17:04
To: solr-user@lucene.apache.org
Subject: RE: Errors when implementing VelocityResponseWriter

Hi Erik, thank you for the reply.

I have placed all the Velocity JAR files in my /lib directory. As explained below, 
I have added the relevant configuration to solrconfig.xml; I am just wondering if 
the config instructions in the wiki are missing something. Can anyone advise on 
this?

As you mentioned, my terminal output suggests that the VelocityResponseWriter 
class is not present and therefore that the Velocity JAR is not present. However, 
this is not the case.

I have specified <lib dir="./lib" /> in solrconfig.xml. Is this enough, or do I 
need to use an exact path? I have already tried specifying an exact path and it 
does not seem to work either.

Thank you

Lewis


RE: Alternative to Solrj

2011-02-11 Thread McGibbney, Lewis John
Hi Erik,

This sounds much more like it. I have had a look at the wiki and it sounds like 
a logical approach to UI customisation. Thank you for this

From: Erik Hatcher [erik.hatc...@gmail.com]
Sent: 11 February 2011 14:12
To: solr-user@lucene.apache.org
Subject: Re: Alternative to Solrj

Sounds like you just described the VelocityResponseWriter.  On trunk (or 3.x I 
believe), try out http://localhost:8983/solr/browse and look at what makes that 
tick.

Erik

On Feb 11, 2011, at 08:40 , McGibbney, Lewis John wrote:

 Hi list,

 I have been looking at an alternative UI config displaying retrieved results 
 from Solr after a query has been passed. At this point, I am not interested 
 in Solrj as all I wish to change is the default responseWriter (line 1007 of 
 Solrconfig). I've also noticed a snippet of default CSS code included in 
 /conf/xslt/example.xsl and understand that all response writers are located 
 in $SOLR_HOME/src/java/org/apache/solr/request and that the default is 
 XSLTResponseWriter.java.
 Basically I wish to keep the code for the search UI as simple as possible 
 (ideally a simple JSP and CSS); however, I now find that this 
 configuration is proving slightly more confusing in practice. My thinking is 
 as follows: write my own responseWriter, include my CSS template within it, then 
 specify the responseWriter in solrconfig along with the Java class. Can 
 anyone advise me on this from their own experience?

 Thank you

 Lewis



Fatal error when posting to Solr

2011-02-11 Thread McGibbney, Lewis John
Hi list,

I was attempting to check out the VelocityResponseWriter before customising it 
for my own usage, and I seem to have opened a can of worms when posting 
documents to Solr. Using the simple post command I get the following output.

lewis@lewis-01:~/Downloads/apache-solr-1.4.1/example/exampledocs$ java -jar post.jar *.pdf
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing files to http://localhost:8983/solr/update..
SimplePostTool: POSTing file technical_handbook_2010_domestic_section_0_general.pdf
SimplePostTool: FATAL: Solr returned an error: Unexpected_character__code_37_in_prolog_expected___at_rowcol_unknownsource_11

In some projects (e.g. Nutch) I am aware that the distribution does not come 
with all JARs and that these have to be downloaded separately; I know this 
is not the case with Solr though. I have also successfully committed a number of 
PDF files to Solr recently, so I know that this is working fine. Checking my 
Solr logs, nothing seems to be out of place!

Has anyone seen anything similar?

Thanks Lewis
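
One possible reading of the error text, going only on the message itself: character code 37 is '%', and PDF files begin with "%PDF-", so an XML parser handed a raw PDF would stop at the very first character of the prolog. The sketch below illustrates that check on the leading bytes of a payload. The class name SniffUpload and its method are hypothetical and are not part of post.jar or Solr.

```java
import java.nio.charset.StandardCharsets;

public class SniffUpload {

    /** Guesses the payload type from its leading bytes. */
    static String sniff(byte[] head) {
        // ISO-8859-1 maps every byte to one char, so binary input cannot fail to decode.
        String start = new String(head, StandardCharsets.ISO_8859_1);
        if (start.startsWith("%PDF-")) {
            return "pdf";
        }
        if (start.trim().startsWith("<")) {
            return "xml";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        byte[] pdfHead = "%PDF-1.4\n".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println((int) '%');      // prints 37, the code named in the error
        System.out.println(sniff(pdfHead)); // prints pdf
    }
}
```

If this reading is right, the problem lies in what was sent to the XML update URL rather than in any missing JAR.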





RequestHandler code within 1.4.0 dist

2011-02-08 Thread McGibbney, Lewis John
Hello list,

I have been searching through the 1.4.0 source for a standard requestHandler 
plug-in example. I understand that for my purposes, extending 
RequestHandlerBase is a starting point; however, I was wondering if there are 
any examples of plug-ins which I can view, such as those contained within 
/contrib. My experience of plug-ins so far relates to those contained within the 
/contrib folder in Solr, or the /plugins folder in Nutch, but the structure does 
not seem to be the same in Solr.

Can anyone please help? Thank you.

Lewis




RE: DataImportHandler usage with RDF database

2011-02-04 Thread McGibbney, Lewis John
Hi Otis... thanks for your thoughts.

> I don't think DIH can read from a triple store today.  It can read from a
> RDBMS, RSS/Atom feeds, URLs, mail servers, maybe others...
> Maybe what you should be looking at is ManifoldCF instead, although I don't
> think it can fetch data from triple stores today either.

OK, well a way I can work around this (for the time being) is to pull data from 
URLs instead.

>> without sending an index commit to Solr. As far as I can see
>> DataImportHandler currently supports full and delta imports which mean I
>> would be indexing.
>
> I don't follow what you mean by this and how it relates to the first part.

Well, as you mentioned below, I'm talking about a custom SearchComponent that 
reads some data from somewhere (a URL for the time being) and then uses it at 
search time for something. I have no need to index this data; I merely require 
it at search time.

>> So far I have yet to find a requestHandler which is able to read then store
>> data in memory, then use this data elsewhere prior to returning documents via
>> queryResponseWriter.
>
> I think you are talking about a custom SearchComponent that reads some data
> from somewhere (e.g. your triple store) and then uses it at search time for
> something.  This sounds doable, although you didn't provide details.  For
> example, we (Sematext) have implemented custom SearchComponents for e-commerce
> customers where frequently-changing information about product availability was
> fetched from external stores and applied to search results.

I have web-based files, and the idea is to specify the URLs to the 
SearchComponent, which can then use the data within them at search time. Did 
your plug-in adhere to the general requestHandler design? Can you provide any 
resource from which I can get started with this?

Thank you
Lewis
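
The "read once, keep in memory, use at search time" part of this idea can be sketched in plain Java, independent of any Solr API. Everything here is an assumption for illustration: the class name ExternalBoosts, the tab-separated "term, boost" line format (the real data would be RDF/OWL and need a proper parser), and the default boost of 1.0. In practice the Reader would wrap a stream opened from one of the URLs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ExternalBoosts {

    // Replaced wholesale on reload, so readers never see a half-built map.
    private volatile Map<String, Float> boosts = Collections.emptyMap();

    /** Parses "term<TAB>boost" lines and swaps the new map in. */
    public void reload(Reader source) throws IOException {
        Map<String, Float> fresh = new HashMap<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] parts = line.split("\t");
                if (parts.length == 2) { // silently skip lines in any other shape
                    fresh.put(parts[0].trim(), Float.parseFloat(parts[1].trim()));
                }
            }
        }
        boosts = Collections.unmodifiableMap(fresh);
    }

    /** Returns the stored boost for a term, or 1.0 when nothing is known. */
    public float boostFor(String term) {
        Float b = boosts.get(term);
        return b == null ? 1.0f : b;
    }

    public static void main(String[] args) throws IOException {
        ExternalBoosts store = new ExternalBoosts();
        store.reload(new StringReader("insulation\t2.5\nfire safety\t1.5\n"));
        System.out.println(store.boostFor("insulation"));
        System.out.println(store.boostFor("unknown"));
    }
}
```

Nothing is indexed here: the data lives only in the map, and a component invoked at query time would call boostFor for each term it cares about.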



RE: value for maxFieldLength

2011-02-03 Thread McGibbney, Lewis John
Thank you Erick

Lewis


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 03 February 2011 13:25
To: solr-user@lucene.apache.org
Subject: Re: value for maxFieldLength

This is not really very large; Solr should handle this easily (assuming
you've given it enough memory), so I'd go with a large number, say
20M. If you start running out of memory, then you've probably given
the JVM too little memory.

But Solr should handle this without a burp.

Best
Erick
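
Since maxFieldLength counts tokens per field rather than bytes, a rough token count of the extracted text gives a feel for whether a given limit would cut a document short. The sketch below is only an approximation: it splits on whitespace, whereas the analyzer configured for the field may tokenize differently. The class and method names are hypothetical.

```java
import java.util.StringTokenizer;

public class TokenEstimate {

    /** Counts whitespace-separated tokens; a real analyzer may split differently. */
    static int countTokens(String text) {
        return new StringTokenizer(text).countTokens();
    }

    /** True if a field holding this text would exceed the given token limit. */
    static boolean wouldTruncate(String text, int maxFieldLength) {
        return countTokens(text) > maxFieldLength;
    }

    public static void main(String[] args) {
        String sample = "technical handbook 2010 domestic section 0 general";
        System.out.println(countTokens(sample));      // prints 7
        System.out.println(wouldTruncate(sample, 5)); // prints true
    }
}
```

Running this over the text extracted from one of the larger PDFs would show how far above or below a candidate limit the documents actually sit.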

On Wed, Feb 2, 2011 at 10:20 AM, McGibbney, Lewis John 
lewis.mcgibb...@gcu.ac.uk wrote:

 Hello list,

 I am aware that setting the value of maxFieldLength in solrconfig.xml too
 high may/will result in out-of-mem errors. I wish to provide content
 extraction on a number of pdf documents which are large, by large I mean
 8-11MB (occasionally more), and I am also not sure how many terms reside in
 each field when it is indexed. My question is therefore what is a sensible
 number to set this value to in order to include the majority/all terms
 within documents of this size.

 Thank you

 Lewis




DataImportHandler usage with RDF database

2011-02-03 Thread McGibbney, Lewis John
Hello List,

I am very interested in DataImportHandler. I have data stored in an RDF db and 
wish to use this data to boost query results via Solr. I wish to keep this data 
stored in the db, as I have a web app which directly maintains it. Is it 
possible to use a DataImportHandler to read RDF data from the db into memory 
without sending an index commit to Solr? As far as I can see, DataImportHandler 
currently supports full and delta imports, which means I would be indexing. So 
far I have yet to find a requestHandler which is able to read then store data 
in memory, then use this data elsewhere prior to returning documents via 
queryResponseWriter.

Can anyone provide their thoughts/insight?

Thank you

Lewis




Next steps in loading plug-in

2011-02-01 Thread McGibbney, Lewis John
Hi list,

Having had a thorough look at the wiki over the weekend and done some testing 
myself, I have some additional questions regarding loading my plug-in into Solr. 
Taking the 'Old Way' of loading plug-ins, I have JARred up the relevant classes 
and added the JAR to the web app WEB-INF/lib dir. I am unsure of the next steps 
to take, as my plug-in has extension properties (which specify web-based OWL 
files that I wish to use whenever the plug-in is invoked). My main question is 
where I should include these config properties. My initial thought is that 
they would go in WEB-INF/web.xml, but I am unsure how to include them. I have 
had a good look at web.xml and think that they could be included as 
init-params, but this is only a guess given my lack of knowledge in this area.

Thank you

Lewis




Adding plug-in to Solr

2011-01-31 Thread McGibbney, Lewis John
Hello list,

I am attempting to port a plug-in to my Solr implementation and would like to 
discuss best practice for doing so. The plug-in relates specifically to the 
query submitted through Solr; the idea is to provide some sort of query 
'refinement' mechanism relating to a specific domain. Some information on a 
similar type of plug-in can be found here:

http://wiki.apache.org/nutch/OntologyPlugin

My question really relates to which config files I need to consult when 
adding plug-ins to Solr, and I would like to ask for users' experience with 
this type of experiment.

Any comments would be great.

Lewis



RE: unknown field 'name'

2010-11-24 Thread McGibbney, Lewis John
I took a look at schema.xml and you are right the field names did not match up 
correctly. It was my own fault.

Thank you for your time sivaprasad and Markus


From: sivaprasad [sivaprasa...@echidnainc.com]
Sent: 24 November 2010 03:58
To: solr-user@lucene.apache.org
Subject: RE: unknown field 'name'

The field names in the XML and in schema.xml should match.
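
A quick way to see which names fail to match is to list the field names used in the update XML and compare them with the names the schema declares. The sketch below does that with the JDK's DOM parser. The class name FieldNameCheck is hypothetical, the schema field set is typed in by hand rather than read from schema.xml, and dynamic fields are ignored.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class FieldNameCheck {

    /** Returns field names used in the update XML that are absent from the schema set. */
    static Set<String> unknownFields(String updateXml, Set<String> schemaFields)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(updateXml.getBytes(StandardCharsets.UTF_8)));
        NodeList fields = doc.getElementsByTagName("field");
        Set<String> unknown = new TreeSet<>();
        for (int i = 0; i < fields.getLength(); i++) {
            String name = ((Element) fields.item(i)).getAttribute("name");
            if (!schemaFields.contains(name)) {
                unknown.add(name);
            }
        }
        return unknown;
    }

    public static void main(String[] args) throws Exception {
        // Made-up document and schema, chosen so that "name" is the odd one out.
        String xml = "<add><doc><field name=\"id\">1</field>"
                + "<field name=\"name\">disk</field></doc></add>";
        Set<String> schema = new TreeSet<>();
        schema.add("id");
        schema.add("title");
        System.out.println(unknownFields(xml, schema)); // prints [name]
    }
}
```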

-Original Message-
From: McGibbney, Lewis John [via Lucene] 
ml-node+1956387-780012783-225...@n3.nabble.com
Sent: Tuesday, November 23, 2010 4:01pm
To: sivaprasad sivaprasa...@echidnainc.com
Subject: unknown field 'name'

Good Evening List,

I have been working with Nutch and due to numerous integration advantages I 
decided to get to grips with the Solr code base.

Solr dist - 1.4.1
java version 1.6.0_22
Windows Vista Home Premium
Command Prompt to execute commands

I encountered the following problem very early on during the indexing stage, 
and even though I asked this question (through the wrong list :0|) I have been 
unable to work out what is going wrong. My searches to date pick up 
hits relating to DB problems and are of no use. I have a new dist of Solr and 
have made no configuration changes to date.

C:\Users\Mcgibbney\Documents\LEWIS\apache-solr-1.4.1\apache-solr-1.4.1\example\exampledocs>java -jar post.jar *.xml
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing files to http://localhost:8983/solr/update..
SimplePostTool: POSTing file hd.xml
SimplePostTool: FATAL: Solr returned an error: ERRORunknown_field_name

Help would be great.

Lewis Mc



unknown field 'name'

2010-11-23 Thread McGibbney, Lewis John
Good Evening List,

I have been working with Nutch and due to numerous integration advantages I 
decided to get to grips with the Solr code base.

Solr dist - 1.4.1
java version 1.6.0_22
Windows Vista Home Premium
Command Prompt to execute commands

I encountered the following problem very early on during the indexing stage, 
and even though I asked this question (through the wrong list :0|) I have been 
unable to work out what is going wrong. My searches to date pick up 
hits relating to DB problems and are of no use. I have a new dist of Solr and 
have made no configuration changes to date.

C:\Users\Mcgibbney\Documents\LEWIS\apache-solr-1.4.1\apache-solr-1.4.1\example\exampledocs>java -jar post.jar *.xml
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing files to http://localhost:8983/solr/update..
SimplePostTool: POSTing file hd.xml
SimplePostTool: FATAL: Solr returned an error: ERRORunknown_field_name

Help would be great.

Lewis Mc
