Re: DIH problem with multiple (types of) resources
hi,

On Tue, Nov 15, 2016 at 02:54:49AM +1100, Alexandre Rafalovitch wrote:

> Attribute names are case sensitive as far as I remember. Try
> 'dataSource' for the second definition.

oh wow... that's sneaky. in the old version the case didn't seem to matter, but now it certainly does. thx :)

--
CUL8R, Peter.
www.desk.nl
Your excuse is: It is a layer 8 problem
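A quick way to catch this kind of slip before running an import is to scan the DIH config for entity attributes that differ from `dataSource` only by case. The sketch below is illustrative: the data source names "web" and "db" and the query come from this thread, while the rest of the config (driver, URLs, entity names) and the helper `miscased_datasource_attrs` are made up for the example, and the config is not meant to be a complete working one.

```python
import xml.etree.ElementTree as ET

# Hypothetical DIH config. Only the names "web"/"db", the data source types
# and the SQL query are taken from the thread; everything else is invented.
CONFIG = """
<dataConfig>
  <dataSource name="web" type="BinURLDataSource"/>
  <dataSource name="db" type="JdbcDataSource"
              driver="org.example.Driver" url="jdbc:example://localhost/db"/>
  <document>
    <entity name="page" dataSource="web" url="http://site/example"/>
    <entity name="edition" datasource="db" query="select edition from editions"/>
  </document>
</dataConfig>
"""

def miscased_datasource_attrs(xml_text):
    """Return (entity name, attribute) pairs where the attribute is
    'dataSource' written with the wrong case."""
    root = ET.fromstring(xml_text)
    problems = []
    for entity in root.iter("entity"):
        for attr in entity.attrib:
            if attr.lower() == "datasource" and attr != "dataSource":
                problems.append((entity.get("name"), attr))
    return problems

for name, attr in miscased_datasource_attrs(CONFIG):
    print(f"entity '{name}': attribute '{attr}' should be 'dataSource'")
```

With the lowercase attribute ignored, the entity presumably has no explicit data source and ends up on a different one than intended, which would fit the "nullselect edition from editions" URL in the error.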
DIH problem with multiple (types of) resources
hi,

I'm porting an old data-import configuration from 4.x to 6.3.0. a minimal config is this :

http://site/nl/${page.pid}"; format="text">

when I try to do a full import with this, I get :

2016-11-14 12:31:52.173 INFO (Thread-68) [ x:meulboek] o.a.s.u.p.LogUpdateProcessorFactory [meulboek] webapp=/solr path=/dataimport params={core=meulboek&optimize=false&indent=on&commit=true&clean=true&wt=json&command=full-import&_=1479122291861&verbose=true} status=0 QTime=11{deleteByQuery=*:* (-1550976769832517632),add=[ed99517c-ece9-40c6-9682-c9ec74173241 (1550976769976172544), 9283532a-2395-43eb-bcb8-fd30c5ebfd08 (1550976770348417024), 87b75d5c-a12a-4538-bc29-ceb13d6a9d1c (1550976770455371776), 476b5da3-3752-4867-bdb3-4264403c5c2d (1550976770787770368), 71cdaadb-62ba-4753-ad1b-01ba7fd75bfa (1550976770875850752), 02f41269-4a28-4001-aab9-7b1feb51e332 (1550976770954493952), 6216ec48-2abd-465b-8d6b-60907c7f49db (1550976771047817216), 4317b308-dc88-47e1-9240-0d7d94646de6 (1550976771136946176), 159ee092-2f72-45f6-970e-9dfd6d635bdf (1550976771221880832), bdfa48c4-23e2-483f-9b63-e0c5753d60a5 (1550976771336175616)]} 0 1465
2016-11-14 12:31:52.173 ERROR (Thread-68) [ x:meulboek] o.a.s.h.d.DataImporter Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception in invoking url null Processing Document # 11
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:270)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:475)
    at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:458)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception in invoking url null Processing Document # 11
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:416)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
    ... 4 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception in invoking url null Processing Document # 11
    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
    at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:89)
    at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:38)
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:244)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
    ... 6 more
Caused by: java.net.MalformedURLException: no protocol: nullselect edition from editions
    at java.net.URL.<init>(URL.java:593)
    at java.net.URL.<init>(URL.java:490)
    at java.net.URL.<init>(URL.java:439)
    at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:81)
    ... 12 more

note that this failure occurs with the second entity, and judging from this line :

Caused by: java.net.MalformedURLException: no protocol: nullselect edition from editions

it seems solr tries to use the datasource named "web" (the BinURLDataSource) instead of the configured "db" datasource (the JdbcDataSource).

am I doing something wrong, or is this a bug ?

--
CUL8R, Peter.
www.desk.nl
Your excuse is: Communist revolutionaries taking over the server room and demanding all the computers in the building or they shoot the sysadmin. Poor misguided fools.
Re: ranged and boolean query
hi,

On Wed, Nov 17, 2010 at 04:39:00PM +0100, Peter Blokland wrote:

> i'm using solr and am trying to limit my resultset to documents
> that either have a publication date in the range * to now, or
> have no publication date set at all (field is not present).
> however, using this :
>
> (pubdate:[* TO NOW]) OR ( NOT pubdate:*)
>
> gives me only the documents in the range * to now (reversing the
> two clauses has no effect).

answering my own question : the above expression was a filter-query, where the main query was (e.g.)

type:page

when only using the negative expression (NOT pubdate:*), this evaluates to

type:page NOT pubdate:*

which is a valid query. however, using the full expression seems to make lucene evaluate

NOT pubdate:*

as a query on its own, which is not legal, and returns an empty result. so, rewriting the filter-query as

(type:page AND pubdate:[* TO NOW]) OR (type:page NOT pubdate:*)

solved my problem... took me long enough...

--
CUL8R, Peter.
www.desk.nl
--- Sent from my NetBSD-powered Talkie Toaster™
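The rewrite can be captured in a small helper so that every branch of the OR carries a positive clause for the negation to subtract from. This is only a sketch: the function name `range_or_missing` and its parameters are invented here, and the `*:*` default is an assumption based on the commonly used match-all idiom, not something tested in this thread.

```python
def range_or_missing(field, base="*:*", upper="NOW"):
    """Build a filter query matching documents whose `field` is in
    [* TO upper] or that have no value for `field` at all.

    `base` is repeated in both branches so the negative clause is never
    evaluated on its own.
    """
    in_range = f"({base} AND {field}:[* TO {upper}])"
    missing = f"({base} NOT {field}:*)"
    return f"{in_range} OR {missing}"

# Reproduces the filter query from the message above.
print(range_or_missing("pubdate", base="type:page"))
```

Called with `base="type:page"` it produces exactly the filter query quoted above; with the default it anchors both branches on `*:*` instead.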
Re: ranged and boolean query
hi,

On Wed, Nov 17, 2010 at 05:00:04PM +0100, Peter Blokland wrote:

>>> pubdate:([* TO NOW] OR (NOT *))

i've gone back to the examples provided with solr 1.4.1. the standard example has 19 documents, one of which has a date-field called 'incubationdate_dt'. so the query

incubationdate_dt:[* TO NOW]

is expected to return 1 document, which it does. the query

-incubationdate_dt:*

is expected to return 18 documents, which it does. however,

incubationdate_dt:[* TO NOW] (-incubationdate_dt:*)

which should (imho) return all 19 documents just returns the one document that has such a field.

can anyone confirm whether or not this is expected behavior, and if so, why ?

--
CUL8R, Peter.
www.desk.nl
--- Sent from my NetBSD-powered Talkie Toaster™
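One way to read these numbers is with a toy set model, offered as an illustration and not as a description of Lucene's actual code. It assumes that a boolean clause containing only prohibited terms has nothing to subtract from and so matches nothing, and that Solr compensates for this only when the whole query is negative. The function `evaluate` and the document ids are invented for the example.

```python
def evaluate(should=(), must_not=(), all_docs=None):
    """Toy boolean query over sets of document ids.

    Positive clauses are unioned, prohibited clauses are subtracted.
    If there are no positive clauses, the result is empty unless
    `all_docs` is given (assumed to mimic the top-level special case).
    """
    if should:
        matched = set().union(*should)
    elif all_docs is not None:
        matched = set(all_docs)
    else:
        matched = set()  # purely negative: nothing to subtract from
    for excluded in must_not:
        matched -= excluded
    return matched

ALL_DOCS = set(range(19))  # the 19 example documents
WITH_DATE = {0}            # the one document that has incubationdate_dt

# -incubationdate_dt:* as the whole query
top_level_negative = evaluate(must_not=[WITH_DATE], all_docs=ALL_DOCS)
# (-incubationdate_dt:*) as a nested clause
nested_negative = evaluate(must_not=[WITH_DATE])
# incubationdate_dt:[* TO NOW] (-incubationdate_dt:*)
combined = evaluate(should=[WITH_DATE, nested_negative])
# same, with the nested clause anchored on a match-all set
anchored = evaluate(
    should=[WITH_DATE, evaluate(should=[ALL_DOCS], must_not=[WITH_DATE])]
)

print(len(top_level_negative), len(nested_negative), len(combined), len(anchored))
```

Under these assumptions the model gives 18 documents for the bare negative query and 1 for the combined query, matching what was observed, and 19 once the nested clause has a positive part to subtract from.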
Re: ranged and boolean query
hi,

On Wed, Nov 17, 2010 at 10:54:48AM -0500, Ken Stanley wrote:

> > pubdate:([* TO NOW] OR (NOT *))

> Instead of using NOT, try simply prefixing the field name with a minus
> sign. This tells SOLR to exclude the field. Otherwise, the word NOT
> would be treated as a term, and would be applied against your default
> field (which may or may not affect your results). So instead of
> (pubdate:[* TO NOW]) OR ( NOT pubdate:*), you would write (pubdate:[*
> TO NOW]) OR ( -pubdate:*).

tried that, it gives me exactly the same result... I can't really figure out what's going on.

--
CUL8R, Peter.
www.desk.nl
--- Sent from my NetBSD-powered Talkie Toaster™
ranged and boolean query
hi.

i'm using solr and am trying to limit my resultset to documents that either have a publication date in the range * to now, or have no publication date set at all (field is not present). however, using this :

(pubdate:[* TO NOW]) OR ( NOT pubdate:*)

gives me only the documents in the range * to now (reversing the two clauses has no effect). using only

NOT pubdate:*

gives me the correct set of documents (those not having a pubdate). any reason the OR does not work in this case ?

ps: also tried it like this :

pubdate:([* TO NOW] OR (NOT *))

which gives the same result.

--
CUL8R, Peter.
www.desk.nl
--- Sent from my NetBSD-powered Talkie Toaster™
Re: Solr PHP PECL Extension going to Stable Release - Wishing for Any New Features?
hi,

On Mon, Oct 11, 2010 at 01:03:07AM -0400, Israel Ekpo wrote:

> If you are using Solr via PHP and would like to see any new features in the
> extension please feel free to send me a note.

I'm currently testing a setup with Solr via PHP, and was wondering if support for the ExtractingRequestHandler is planned ? It may be that I missed something in the documentation, but for now it looks like I need to build my own POSTs to the /solr/update/extract handler.

--
CUL8R, Peter.
www.desk.nl
--- Sent from my NetBSD-powered Talkie Toaster™
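For reference, a hand-built POST to the extract handler might look roughly like the sketch below. It is written in Python rather than PHP purely for illustration and only constructs the request without sending it. The host, port, document id and payload are placeholders, the helper `build_extract_request` is invented for the example, and the parameters shown (`literal.id`, `uprefix`, `commit`) are the ones I would expect to need, not a tested recipe.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_extract_request(base_url, doc_id, payload,
                          content_type="application/octet-stream"):
    """Build (but do not send) a POST of raw document bytes to the
    /update/extract handler under `base_url`."""
    params = urlencode({
        "literal.id": doc_id,   # id to assign to the extracted document
        "uprefix": "attr_",     # prefix for metadata fields not in the schema
        "commit": "true",
    })
    return Request(
        f"{base_url}/update/extract?{params}",
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )

# Placeholder host, id and payload; pass the result to urllib.request.urlopen to send it.
req = build_extract_request("http://localhost:8983/solr", "doc1",
                            b"%PDF-1.4 ...", content_type="application/pdf")
print(req.get_method(), req.full_url)
```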
TikaEntityProcessor and metadata
hi,

I'm using Solr to index documents through both a combination of DataImportHandler/TikaEntityProcessor and Solr's ExtractingRequestHandler. The latter gives me the option of dynamically mapping metadata to fields using "uprefix='attr_'" in the configuration. Is it possible to do the same thing from DIH _without_ exhaustively mapping all (possible) fields myself ?

--
CUL8R, Peter.
www.desk.nl
--- Sent from my NetBSD-powered Talkie Toaster™