Re: Data Import Handler Rich Format Documents

2010-06-18 Thread Sixten Otto
On Fri, Jun 18, 2010 at 2:42 PM, Chris Hostetter wrote:
> I'm confused ... You're using DIH, and some of your fields are URLs to
> documents that you want to parse with Tika?
>
> Why would you need a custom Transformer?
Yeah, I can definitely vouch that DIH can handle this without additional codi

Re: HOWTO get a working copy of SOLR?

2010-06-15 Thread Sixten Otto
On Tue, Jun 15, 2010 at 12:58 AM, Bernd Fehling wrote:
> - changed to SOLR branch_3x. Installs fine, runs fine, luke works fine but
>  the extraction with /update/extract (ExtractingRequestHandler) only replies
>  the metadata but not the content.
Sounds like https://issues.apache.org/jira/browse

Re: Tomcat startup script

2010-06-09 Thread Sixten Otto
On Tue, Jun 8, 2010 at 4:18 PM, wrote:
> The following should work on centos/redhat, don't forget to edit the paths,
> user, and java options for your environment. You can use chkconfig to add it
> to your startup.
Thanks, Colin.
Sixten

Re: TikaEntityProcessor on Solr 1.4?

2010-06-08 Thread Sixten Otto
2010/5/22 Noble Paul നോബിള്‍ नोब्ळ् : > just copy the dih-extras jar file from the nightly should be fine Now that I've finally got a server on which to attempt to set these things up... this turns out not to be a viable solution. The extras jar does contain the TikaEntityProcessor class, but NOT

Re: Tomcat startup script

2010-06-08 Thread Sixten Otto
On Tue, Jun 8, 2010 at 11:00 AM, K Wong wrote:
> Okay. I've been running multicore Solr 1.4 on Tomcat 5.5/OpenJDK 6
> straight out of the centos repo and I've not had any issues. We're not
> doing anything wild and crazy with it though.
It's nice to know that the wiki's advice might be out of dat

Re: Tomcat startup script

2010-06-08 Thread Sixten Otto
On Mon, Jun 7, 2010 at 9:23 PM, K Wong wrote:
> Did you install tomcat 5.5 from an RPM?
I did not, on the advice of that same Solr wiki article that manual installation is "recommended because distribution Tomcats are either old or quirky." There haven't been any issues with this, except that the

Re: Tomcat startup script

2010-06-07 Thread Sixten Otto
On Mon, Jun 7, 2010 at 2:35 PM, Chris Hostetter wrote:
> there is currently a bug with the apache wiki and attachments...
> https://issues.apache.org/jira/browse/INFRA-2773
Glad to know it's not just me. But does anyone have that script posted anywhere else?
Sixten

Tomcat startup script

2010-06-07 Thread Sixten Otto
So, looking at the wiki article on setting up Solr with Tomcat (http://wiki.apache.org/solr/SolrTomcat), there's a link to an attached init.d script for CentOS/RedHat/Fedora. Trouble is, the wiki won't let me access it. Even after creating an account and logging in, clicking on the link (http://wik

Re: How real-time are Solr/Lucene queries?

2010-05-26 Thread Sixten Otto
On Wed, May 26, 2010 at 11:30 AM, Thomas J. Buhr wrote:
> Basically, I need to know that issuing searches to a local index will not be
> slower than searching a hashmap or array. How different or similar will the
> performance be?
If you don't mind my asking... I'm still trying to understand wh

Re: TikaEntityProcessor on Solr 1.4?

2010-05-21 Thread Sixten Otto
On Fri, May 21, 2010 at 5:30 PM, Chris Harris wrote: > Actually, rather than cherry-pick just the changes from SOLR-1358 and > SOLR-1583 what I did was to merge in all DataImportHandler-related > changes from between the 1.4 release up through Solr trunk r890679 > (inclusive). I'm not sure if that

Re: TikaEntityProcessor on Solr 1.4?

2010-05-21 Thread Sixten Otto
2010/5/19 Noble Paul നോബിള്‍ नोब्ळ् : > I guess it should work because Tika Entityprocessor does not use any > new 1.4 APIs > > On Wed, May 19, 2010 at 1:17 AM, Sixten Otto wrote: >> The TikaEntityProcessor class that enables DataImportHandler to >> process business docum

TikaEntityProcessor on Solr 1.4?

2010-05-18 Thread Sixten Otto
Sorry to repeat this question, but I realized that it probably belonged in its own thread: The TikaEntityProcessor class that enables DataImportHandler to process business documents was added after the release of Solr 1.4, along with some other changes (like the binary DataSources) to support it.

Re: Which Solr to use?

2010-05-18 Thread Sixten Otto
On Tue, May 18, 2010 at 10:40 AM, Robert Muir wrote:
> Some discussions/voting happened and the trunk is intended to be ...
> more like a normal trunk.
>
> If you need features not in an official release, and are looking for a
> codebase with updated features, I would recommend instead considering

Which Solr to use?

2010-05-17 Thread Sixten Otto
I've been investigating Solr on and off as a (or even the) search solution for my employer's content management solution. One of the biggest questions in my mind at this point is which version to go with. In general, 1.4 would seem the obvious choice, as it's the only released version on that list.