Re: Debugging custom RequestHandler: spinning up a core for debugging

2017-12-22 Thread Tod Olson
Thanks, that pointed me in the right direction! The problem was an ancient ICU library in the distributed code. -Tod On Dec 15, 2017, at 5:15 PM, Erick Erickson <erickerick...@gmail.com> wrote: My guess is this isn't a Solr issue at all; you are s

Debugging custom RequestHandler: spinning up a core for debugging

2017-12-15 Thread Tod Olson
lr core loaded!"); } @AfterClass public static void cleanUpClass() { core.close(); container.shutdown(); logger.info("Solr core shut down!"); } } The test, run through ant, fails as follows: [junit] solr.solr.home= /

Re: Compile problems with anonymous SimpleCollector in custom request handler

2017-11-30 Thread Tod Olson
: Classpath: ${classpathProp} -Tod On Nov 29, 2017, at 6:00 PM, Shawn Heisey <apa...@elyograg.org> wrote: On 11/29/2017 2:27 PM, Tod Olson wrote: I'm modifying an existing custom request handler for an open source project, and am looking f

Compile problems with anonymous SimpleCollector in custom request handler

2017-11-29 Thread Tod Olson
1.8 and Solr 6.4.2. There are two things I do not understand. First: [javac] /Users/tod/src/vufind-browse-handler/browse-handler/java/org/vufind/solr/handler/BrowseRequestHandler.java:445: error: is not abstract and does not override abstract method setNextReader(AtomicReaderContext

Upgrading Tika in place

2013-02-05 Thread Tod
without needing to rebuild or upgrade Solr. Is that a possibility, and if so, how would I go about accomplishing it? I see tika-core and tika-parsers in the 3.6.2 Solr build distro; are those the only two files I need? Thanks - Tod

Solr 3.6 parsing and extraction files

2012-04-18 Thread Tod
? Thanks - Tod

Indexing Using XML Message

2012-01-25 Thread Tod
appropriate way to accomplish this? I could use the Tika CLI to generate XML but I'm not sure it would work or that it's the most efficient way to handle things. Can anyone offer some suggestions? Thanks - Tod

Re: Help! - ContentStreamUpdateRequest

2011-11-16 Thread Tod
dumb. I'll be happy to share any information about my environment or configuration if it will help find my error. Thanks for all of your help. - Tod On 11/15/2011 8:08 PM, Erick Erickson wrote: That's odd. What are your autocommit parameters? And are you either committing or optimizing

Re: Help! - ContentStreamUpdateRequest

2011-11-15 Thread Tod
documents it needs to index in chunks rather than one at a time as I'm doing now. The one-at-a-time approach is locking up the Solr server at around 700 entries. My thought was that if I could chunk them a batch at a time, the lockup would stop and indexing performance would improve. Thanks - Tod
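The batching idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the thread: `BatchSketch`, `partition`, and the `example.invalid` URLs are hypothetical names, and the actual SolrJ request and commit calls are deliberately left as a placeholder comment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: groups work items into fixed-size batches so that
// an expensive step (for example a commit) can run once per batch.
final class BatchSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            urls.add("http://example.invalid/doc" + i + ".pdf"); // made-up URLs
        }
        for (List<String> batch : partition(urls, 2)) {
            // Placeholder: send each document in the batch to Solr here,
            // then commit once for the whole batch instead of once per document.
            System.out.println("batch of " + batch.size() + ": " + batch);
        }
    }
}
```

Whether batching alone would cure the lockup at around 700 entries is not something the thread confirms; the sketch only shows the grouping step.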

Help! - ContentStreamUpdateRequest

2011-11-14 Thread Tod
Could someone take a look at this page: http://wiki.apache.org/solr/ContentStreamUpdateRequestExample ... and tell me what code changes I would need to make to be able to stream a LOT of files at once rather than just one? It has to be something simple like a collection of some sort but I

Batch indexing documents using ContentStreamUpdateRequest

2011-11-04 Thread Tod
missing? Thanks - Tod

Re: Batch indexing documents using ContentStreamUpdateRequest

2011-11-04 Thread Tod
won't budge it. On 11/04/2011 12:36 PM, Tod wrote: This is a code fragment of how I am doing a ContentStreamUpdateRequest using CommonsHttpSolrServer: ContentStreamBase.URLStream csbu = new ContentStreamBase.URLStream(url); InputStream is = csbu.getStream(); FastInputStream fis = new

can solr follow and index hyperlinks embedded in rich text documents (pdf, doc, etc)?

2011-10-21 Thread Tod
. Thanks - Tod

Re: java.lang.NoSuchMethodError: org.slf4j.spi.LocationAwareLogger.log

2011-10-21 Thread Tod
On 10/19/2011 2:58 PM, wrote: Hi Tod, I had similar issue with slf4j, but it was NoClassDefFound. Do you have some other dependencies in your application that use some other version of slf4j? You can use mvn dependency:tree to get all dependencies in your application. Or maybe there's some
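Alongside `mvn dependency:tree`, one way to check for a version clash like this is to ask the JVM where it actually loaded a class from. The following is a minimal sketch using only the standard library; `ClassOrigin` and `locate` are hypothetical names, and it has to run with the same classpath as the failing application to be meaningful.

```java
import java.security.CodeSource;

// Hypothetical diagnostic: reports which jar or directory a class was loaded from.
final class ClassOrigin {

    static String locate(String className) {
        try {
            // Load without initializing, so static initializers do not run.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // Classes shipped with the JDK usually have no code source.
                return "JDK or bootstrap class path";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found: " + className;
        }
    }

    public static void main(String[] args) {
        // Example: java ClassOrigin org.slf4j.spi.LocationAwareLogger
        for (String name : args) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the reported location is an unexpected or older jar, that would point toward a classpath conflict rather than a Solr problem.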

java.lang.NoSuchMethodError: org.slf4j.spi.LocationAwareLogger.log

2011-10-19 Thread Tod
- Tod

Re: Instructions for Multiple Server Webapps Configuring with JNDI

2011-10-18 Thread Tod
On 10/14/2011 2:44 PM, Chris Hostetter wrote: : modified the solr/home accordingly. I have an empty directory under : tomcat/webapps named after the solr home directory in the context fragment. if that empty directory has the same base name as your context fragment (ie: tomcat/webapps/solr0

Please help - Solr Cell using 'stream.url'

2011-10-07 Thread Tod
that is needed. Thanks - Tod

Solr read timeout

2011-08-18 Thread Tod
on resolving this problem rather than blanket tweaking the entire config. Is there anything in particular I should look at? Can I provide any more information? Thanks - Tod

Most current Tika jar files that work with Solr 1.4.1

2011-08-17 Thread Tod
What is the latest version of Tika that I can use with Solr 1.4.1? It comes packaged with 0.4. I tried 0.8 and it no workie.

Re: ContentStreamLoader Problem

2011-07-13 Thread Tod
on it? If that works, then it's likely a classpath issue Best Erick I'll give it a shot and report back. Thanks - Tod

ContentStreamLoader Problem

2011-07-12 Thread Tod
already have an existing 1.4.1 instance running, could that be causing the problem? Thanks - Tod Jul 12, 2011 1:11:31 PM org.apache.solr.update.processor.LogUpdateProcessor finish INFO: {} 0 1 Jul 12, 2011 1:11:31 PM org.apache.solr.common.SolrException log SEVERE

tika.parser.AutoDetectParser

2011-07-01 Thread Tod
but this message seems to contradict that unless I'm missing a jar somewhere. I've got both dataimporthandler jar files in my WEB-INF/lib dir so not sure what I could be missing. Any ideas? Thanks - Tod

Re: tika.parser.AutoDetectParser

2011-07-01 Thread Tod
On 07/01/2011 12:59 PM, Shawn Heisey wrote: On 7/1/2011 9:23 AM, Tod wrote: I'm working on upgrading to v3.2 from v 1.4.1. I think I've got everything working but when I try to do a data import using dataimport.jsp I'm rolling back and getting class not found exception on the above referenced

Re: Default schema - 'keywords' not multivalued

2011-06-29 Thread Tod
the schema then. The problem with TikaEntityProcessor is this installation is still running v1.4.1 so I'll need to upgrade. Any short and sweet instructions for upgrading to 3.2? I have a pretty straightforward Tomcat install; would just dropping in the new war suffice? - Tod

Re: Default schema - 'keywords' not multivalued

2011-06-28 Thread Tod
On 06/27/2011 11:23 AM, lee carroll wrote: Hi Tod, A list of keywords would be fine in a non multi valued field: keywords : xxx yyy sss aaa multi value field would allow you to repeat the field when indexing keywords: xxx keywords: yyy keywords: sss etc Thanks Lee. The problem is I'm
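To make the distinction concrete, here is a small sketch of what "repeating the field when indexing" could look like in an XML update message. It assumes the `keywords` field is declared multiValued in the schema and that the document has an `id` field; `KeywordsDoc`, `toAddXml`, and `escape` are hypothetical helpers written for illustration, not part of Solr.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds an <add> message with one <field> per keyword.
final class KeywordsDoc {

    static String toAddXml(String id, List<String> keywords) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        sb.append("<field name=\"id\">").append(escape(id)).append("</field>");
        for (String keyword : keywords) {
            // One element per value: this is the repetition a multiValued field allows.
            sb.append("<field name=\"keywords\">").append(escape(keyword)).append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    // Minimal XML escaping for element text.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // Split a space-separated keyword string into separate values.
        List<String> keywords = Arrays.asList("xxx yyy sss aaa".split("\\s+"));
        System.out.println(toAddXml("doc-1", keywords));
    }
}
```

With a single-valued field, the same keywords would instead go into one `<field name="keywords">xxx yyy sss aaa</field>` element.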

Default schema - 'keywords' not multivalued

2011-06-27 Thread Tod
This was a little curious to me and I wondered what the thought process was behind it before I decide to change it. Thanks - Tod

Tika Jax-RS and DIH

2011-06-22 Thread Tod
. Thanks - Tod

Indexing Mediawiki

2011-06-07 Thread Tod
and would be better off dumping and indexing the wiki instead? Thanks - Tod

Can ExtractingRequestHandler ignore documents metadata

2011-05-09 Thread Tod
I'm indexing content from a CMS' database of metadata. The client would prefer that Solr exclude the properties (metadata) of any documents being indexed. Is there a way to tell Tika to only index a document's text and not its properties? Thanks - Tod

Opensearch Format Support

2011-01-20 Thread Tod
Does Solr support the Opensearch format? If so could someone point me to the correct documentation? Thanks - Tod

Re: Retrieving indexed content containing multiple languages

2010-11-16 Thread Tod
to give everybody else a leg up. - Tod - Original Message From: Tod <listac...@gmail.com> To: solr-user@lucene.apache.org Sent: Thu, November 11, 2010 11:35:23 AM Subject: Retrieving indexed content containing multiple languages My Solr corpus is currently created by indexing metadata

Re: Any Copy Field Caveats?

2010-11-11 Thread Tod
I've noticed that using camelCase in field names causes problems. On 11/5/2010 11:02 AM, Will Milspec wrote: Hi all, we're moving from an old lucene version to solr and plan to use the Copy Field functionality. Previously we had rolled our own implementation, sticking title, description,

Retrieving indexed content containing multiple languages

2010-11-11 Thread Tod
on how to handle these types of language challenges? Thanks in advance - Tod

Chinese characters - a little OT

2010-11-10 Thread Tod
hunch is I should UTF-8 encode the title and then try to display the result, but it's not working. I am still seeing the Unicode characters. Does anyone see what I could be doing wrong? TIA - Tod

Re: Phrase Query Problem?

2010-11-02 Thread Tod
this both with and without quotes. What could I be doing wrong? Thanks - Tod Tod, Without knowing your exact field definition, my first guess would be your first boolean query; because it is not quoted, what SOLR typically does is to transform that type of query into something like (assuming your

Re: Phrase Query Problem?

2010-11-02 Thread Tod
the quotes to have triggered a phrase query. Thanks for your help. - Tod

Facet count of zero

2010-11-01 Thread Tod
the other foo's show up with valid counts. Can I do this? Is my syntax incorrect? Thanks - Tod

Re: Facet count of zero

2010-11-01 Thread Tod
do this? Is my syntax incorrect? Thanks - Tod Excellent, I completely missed it - thanks!

Phrase Query Problem?

2010-11-01 Thread Tod
)OR(mykeywords:ALL)))&start=0&indent=true&wt=json should, with an exact match, return only one entry, but it returns five, some of which don't have any of the fields I've specified. I've tried this both with and without quotes. What could I be doing wrong? Thanks - Tod

Overriding Tika's field processing

2010-10-28 Thread Tod
the 'literal.title' processing correctly? Does anybody have experience/suggestions on how to handle this? Thanks - Tod

Re: UpdateXmlMessage

2010-10-04 Thread Tod
The wiki docs were a little sparse on this one. - Tod Tod wrote: I can do this using GET: http://localhost:8983/solr/update?stream.body=%3Cdelete%3E%3Cquery%3Eoffice:Bridgewater%3C/query%3E%3C/delete%3E http://localhost:8983/solr/update?stream.body=%3Ccommit/%3E ... but can I pass

UpdateXmlMessage

2010-10-01 Thread Tod
I can do this using GET: http://localhost:8983/solr/update?stream.body=%3Cdelete%3E%3Cquery%3Eoffice:Bridgewater%3C/query%3E%3C/delete%3E http://localhost:8983/solr/update?stream.body=%3Ccommit/%3E ... but can I pass a stream.url parameter using an UpdateXmlMessage? I looked at the schema and
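The GET URLs above carry a percent-encoded XML message in `stream.body`. As a rough sketch, the encoding step can be done with the standard library (Java 10 or later for this `URLEncoder.encode` overload); `StreamBodyUrl` and `build` are hypothetical names. Note that `URLEncoder` also escapes `:` and `/`, so its output looks slightly different from the hand-written URLs in the message but decodes to the same XML.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: appends an XML update message as a stream.body parameter.
final class StreamBodyUrl {

    static String build(String updateUrl, String xmlMessage) {
        // Percent-encode the XML so it is safe inside a query string.
        String encoded = URLEncoder.encode(xmlMessage, StandardCharsets.UTF_8);
        return updateUrl + "?stream.body=" + encoded;
    }

    public static void main(String[] args) {
        String base = "http://localhost:8983/solr/update"; // URL taken from the message above
        System.out.println(build(base, "<delete><query>office:Bridgewater</query></delete>"));
        System.out.println(build(base, "<commit/>"));
    }
}
```

This only builds the URLs; it does not answer whether `stream.url` can be passed inside an UpdateXmlMessage, which is the open question in the thread.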

Re: Solrj ContentStreamUpdateRequest Slow

2010-08-19 Thread Tod
is pointing to the solr.request(req) line. Thanks - Tod

Re: Solrj ContentStreamUpdateRequest Slow

2010-08-18 Thread Tod
=true' ... works fine - I just want to do it a LOT and as efficiently as possible. If I have to I can wrap it in a perl script and run a cURL or LWP loop but I'd prefer to use SolrJ if I can. Thanks for all your help. - Tod

Re: Solrj ContentStreamUpdateRequest Slow

2010-08-13 Thread Tod
to tell Solr where the document lived so it could go out and stream it into the index for me. That's where I thought StreamingUpdateSolrServer would help. - Tod

Re: Solrj ContentStreamUpdateRequest Slow

2010-08-06 Thread Tod
a feeling I'm doing something dumb but just can't seem to pinpoint the exact problem. Thanks - Tod code--- import java.io.File; import java.io.IOException; import org.apache.solr.client.solrj.SolrServer; import org.apache.solr.client.solrj.SolrServerException; import

Solrj ContentStreamUpdateRequest Slow

2010-08-04 Thread Tod
pdf file, there are no firewall issues, solr is running on the same machine, and I tried the actual host name in addition to localhost but nothing helps. Thanks - Tod http://wiki.apache.org/solr/ContentStreamUpdateRequestExample

Supplementing already indexed data

2010-07-11 Thread Tod
I'm getting metadata from an RDB but the actual content is stored somewhere else. I'd like to index the content too but I don't want to overlay the already indexed metadata. I know this can be done but I just can't seem to dig up the correct docs. Can anyone point me in the right direction?

Re: Data Import Handler Rich Format Documents

2010-07-06 Thread Tod
be after I checked out and built from trunk? Thanks - Tod

Indexing Rich Format Documents using Data Import Handler (DIH) and the TikaEntityProcessor

2010-06-23 Thread Tod
with the error above. Thanks - Tod

Data Import Handler Rich Format Documents

2010-06-18 Thread Tod
and maybe without even needing to use Nutch. I'm using the current release version of Solr. Thanks in advance. - Tod

Re: Data Import Handler Rich Format Documents

2010-06-18 Thread Tod
On 6/18/2010 9:12 AM, Otis Gospodnetic wrote: Tod, You didn't mention Tika, which makes me think you are not aware of it... You could implement a custom Transformer that uses Tika to perform rich doc text extraction, just like ExtractingRequestHandler does it (see http://wiki.apache.org/solr

Re: Data Import Handler Rich Format Documents

2010-06-18 Thread Tod
On 6/18/2010 11:24 AM, Otis Gospodnetic wrote: Tod, I don't think DIH can do that, but who knows, let's see what others say. Yes, Nutch uses TIKA, too. Otis Looks like the ExtractingRequestHandler uses Tika as well. I might just use this but I'm wondering if there will be a large

Re: Data Import Handler Rich Format Documents

2010-06-18 Thread Tod
ExtractingRequestHandler to push everything into the index. I just wanted to see what everybody else is doing and what my other options might be. Thanks - Tod Ref: http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Searching-rich-format-documents-stored-DBMS

Re: JSON formatted response from SOLR question....

2010-05-11 Thread Tod
Jon, Yes!!! rsp.facet_counts.facet_fields.['var'].length to rsp.facet_counts.facet_fields[var].length and voila. Tripped up on a syntax error, how special. Just needed another set of eyes - thanks. VelocityResponse duly noted, it will come in handy later. - Tod On 5/10/2010 4:55 PM, Jon

JSON formatted response from SOLR question....

2010-05-10 Thread Tod
I apologize, this is such a JSON/javascript question but I'm stuck and am not finding any resources that address this specifically. I'm doing a faceted search and getting back in my facet_counts.faceted_fields response an array of countries. I'm gathering the count of the array elements