Re: Checking Optimal Values for BM25

2016-12-15 Thread Sascha Szott
Hi Furkan, in order to change the BM25 parameter values k1 and b, the following XML snippet needs to be added to your schema.xml configuration file: <similarity class="solr.BM25SimilarityFactory"> <float name="k1">1.3</float> <float name="b">0.7</float> </similarity> It is even possible to specify the SimilarityFactory on individual index fields. See [1] for more details. Best, Sascha [1]
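
To get a feeling for what k1 and b do before changing them in schema.xml, the textbook BM25 term score can be sketched in a few lines. This is only an illustration under assumptions: the function name and arguments are made up for this example, and the formula is the commonly cited BM25 form, not a copy of Lucene's implementation.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 score of a single term in a single document.

    k1 controls how quickly the term frequency saturates,
    b controls how strongly the document length is normalized.
    """
    # Inverse document frequency (smoothed so it never becomes negative).
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: 1.0 for an average-length document.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * length_norm)


if __name__ == "__main__":
    # Same term statistics, values from the snippet above (k1=1.3, b=0.7).
    for doc_len in (5, 10, 20):
        print(doc_len, bm25_term_score(2, doc_len, 10, 100, 5, k1=1.3, b=0.7))
```

With b set to 0 the document length no longer influences the score, which makes it easy to check whether length normalization is responsible for an unexpected ranking.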

Re: field length within BM25 score calculation in Solr 6.3

2016-12-15 Thread Sascha Szott
of fieldLength does not match 8. Is there some "magic" applied to the value of field length that goes beyond the standard BM25 score formula? If so, what is the idea behind this modification? If not, is this a Lucene / Solr bug? Best regards, Sascha -- Sascha Szott :: KOBV/ZIB :: +49 30 84185-457

field length within BM25 score calculation in Solr 6.3

2016-12-04 Thread Sascha Szott
Hi folks, my Solr index consists of one document with a single valued field "title" of type "text_general". The title field was indexed with the content: 1 2 3 4 5 6 7 8 9. The field type text_general uses a StandardTokenizer which should result in 9 tokens. The corresponding length of field
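
One possible explanation, which is an assumption on my side and not something confirmed in this thread: as far as I understand Lucene 6.x, BM25Similarity does not store the exact token count but a lossy one-byte norm (1/sqrt(length), squeezed into a tiny float with 3 mantissa bits). The sketch below re-implements that encoding in Python to show the effect; the function names are made up, and the bit layout follows my reading of Lucene's SmallFloat "315" format.

```python
import math
import struct


def float_to_byte315(f):
    """Lossy encoding of a float into one byte (3 mantissa bits, zero exponent 15)."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    smallfloat = bits >> (24 - 3)
    fzero = (63 - 15) << 3
    if smallfloat <= fzero:
        return 0 if bits <= 0 else 1  # underflow
    if smallfloat >= fzero + 0x100:
        return 255  # overflow
    return smallfloat - fzero


def byte315_to_float(b):
    """Inverse of float_to_byte315 (up to the precision that was lost)."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def decoded_field_length(num_tokens):
    """Field length as it would come back after the lossy round trip."""
    f = byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_tokens)))
    return 1.0 / (f * f)


if __name__ == "__main__":
    for n in (1, 4, 9, 16):
        print(n, decoded_field_length(n))
```

Under these assumptions only some lengths survive the round trip unchanged; other values are mapped to a nearby representable length, which would explain a fieldLength in the explain output that differs from the real token count.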

Re: Problem of facet on 170M documents

2013-11-02 Thread Sascha SZOTT
Hi Ming, which Solr version are you using? In case you use one of the latest versions (4.5 or above) try the new parameter facet.threads with a reasonable value (4 to 8 gave me a massive performance speedup when working with large facets, i.e. nTerms 10^7). -Sascha Mingfeng Yang wrote: I

intersection of filter queries with raw query parser

2013-05-31 Thread Sascha Szott
Hi folks, is it possible to use the raw query parser with a disjunctive filter query? Say, I have a field 'foo' and two values 'v1' and 'v2' (the field values are free text and can contain any character). What I want is to retrieve all documents satisfying fq=foo:(v1 OR v2). In case only one

Re: Does SolrCloud support distributed IDFs?

2012-10-22 Thread Sascha SZOTT
Hi Mark, Mark Miller wrote: Still waiting on that issue. I think Andrzej should just update it to trunk and commit - it's optional and defaults to off. Go vote :) Sounds like the problem is already solved and the remaining work consists of code integration? Can somebody estimate how much work

Does SolrCloud support distributed IDFs?

2012-10-21 Thread Sascha Szott
Hi folks, a known limitation of the old distributed search feature is the lack of distributed/global IDFs (#SOLR-1632). Does SolrCloud bring some improvements in this direction? Best regards, Sascha

Re: Prefix query is not analysed?

2012-07-02 Thread Sascha Szott
Hi, wildcard and fuzzy queries are not analyzed. -Sascha Alok Bhandari alokomprakashbhand...@gmail.com wrote: Hello , I am pushing Chuck Follett'.?.? in solr and when I query for this field with query string field:Follett'.* I am getting 0 results. field type declared is fieldType

Re: Prefix query is not analysed?

2012-07-02 Thread Sascha Szott
Hi, I suppose you are using Solr 3.6. Then take a look at http://www.lucidimagination.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/ -Sascha Alok Bhandari alokomprakashbhand...@gmail.com wrote: Thanks for reply. If I check the debug query through

Re: indexing documents in Apache Solr using php-curl library

2012-07-02 Thread Sascha SZOTT
Hi, perhaps it's better to use a PHP Solr client library. I used https://code.google.com/p/solr-php-client/ in a project of mine and it worked just fine. -Sascha Asif wrote: I am indexing the file using php curl library. I am stuck here with the code echo Stored in: . upload/ .

Re: how to retrieve a doc from its docID ?

2012-06-30 Thread Sascha Szott
Hi, did you include the fl parameter in the Solr query URL? If that's the case make sure that the field name 'text' is mentioned there. You should also make sure that the field definition (in schema.xml) for 'text' says stored=true, otherwise the field will not be returned. -Sascha

Re: querying thru solritas gives me zero results

2012-06-30 Thread Sascha Szott
Hi, Solritas uses the dismax query parser. The dismax config parameter 'qf' specifies the index fields to be searched in. Make sure that 'name' is your default search field. -Sascha Giovanni Gherdovich g.gherdov...@gmail.com wrote: Hi all, this morning I was very proud of myself since

Re: Searching for digits with strings

2012-06-27 Thread Sascha Szott
Hi, as far as I know Solr does not provide such a feature. If you cannot make any assumptions on the numbers, choose an appropriate library that is able to transform between numerical and non-numerical representations and populate the search field with both versions at index-time. -Sascha
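
The idea of populating the search field with both versions at index time could look roughly like the following sketch. It is only meant as an illustration: the helper names are invented, the conversion is hand-rolled for English and the range 0 to 999, and a real setup would more likely rely on an existing number-to-words library.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer between 0 and 999 in English."""
    if not 0 <= n < 1000:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[rest] if rest else "")
    hundreds, rest = divmod(n, 100)
    head = ONES[hundreds] + " hundred"
    return head + (" " + number_to_words(rest) if rest else "")


def both_representations(n):
    """Values to put into a multi-valued search field at index time."""
    return [str(n), number_to_words(n)]


if __name__ == "__main__":
    print(both_representations(42))
```

A query for either the digits or the spelled-out form can then match the same document, at the cost of a slightly larger index.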

Re: getting started

2011-06-16 Thread Sascha SZOTT
Hi Mari, it depends ... * How many records are stored in your MySQL databases? * How often will updates occur? * How many db records / index documents are changed per update? I would suggest starting with a single Solr core first. That way, you can concentrate on the basics and do not need to

Re: Solr coding

2011-03-23 Thread Sascha Szott
Hi, depending on your needs, take a look at Apache ManifoldCF. It adds document-level security on top of Solr. -Sascha On 23.03.2011 14:20, satya swaroop wrote: Hi All, As for my project Requirement i need to keep privacy for search of files so that i need to modify the code of

Re: Search failing for matched text in large field

2011-03-23 Thread Sascha Szott
Hi Paul, did you increase the value of the maxFieldLength parameter in your solrconfig.xml? -Sascha On 23.03.2011 17:05, Paul wrote: I'm using solr 1.4.1. I have a document that has a pretty big field. If I search for a phrase that occurs near the start of that field, it works fine. If I

Re: Search failing for matched text in large field

2011-03-23 Thread Sascha Szott
On 23.03.2011 18:52, Paul wrote: I increased maxFieldLength and reindexed a small number of documents. That worked -- I got the correct results. In 3 minutes! Did you mark the field in question as stored = false? -Sascha I assume that if I reindex all my documents that all searches will

Re: Index MS office

2011-02-02 Thread Sascha Szott
Hi, have a look at Solr's ExtractingRequestHandler: http://wiki.apache.org/solr/ExtractingRequestHandler -Sascha On 02.02.2011 16:49, Thumuluri, Sai wrote: Good Morning, I am planning to get started on indexing MS office using ApacheSolr - can someone please direct me where I should

Re: Malformed XML with exotic characters

2011-02-01 Thread Sascha Szott
Hi folks, I've made the same observation when working with Solr's ExtractingRequestHandler on the command line (no browser interaction). When issuing the following curl command curl 'http://mysolrhost/solr/update/extract?extractOnly=true&extractFormat=text&wt=xml&resource.name=foo.pdf'

Re: Malformed XML with exotic characters

2011-02-01 Thread Sascha Szott
perfectly with the same returned data in some AJAX environment. On Tuesday 01 February 2011 18:29:06 Sascha Szott wrote: Hi folks, I've made the same observation when working with Solr's ExtractingRequestHandler on the command line (no browser interaction). When issuing the following curl command

missing type check when working with pint field type

2011-01-18 Thread Sascha Szott
Hi folks, I've noticed an unexpected behavior while working with the various built-in integer field types (int, tint, pint). It seems that the first two are subject to type checking, while the latter is not. I'll give you an example based on the example schema that is shipped out

Re: missing type check when working with pint field type

2011-01-18 Thread Sascha Szott
) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) [...] Is this a bug or did I miss something? -Sascha -- Sascha Szott :: KOBV

Re: post search using solrj

2010-12-30 Thread Sascha SZOTT
Hi Don, you could give the HTTP method to be used as a second argument to the QueryRequest constructor:

DataImportHandler in Solr 1.4.1: exception handling in FileListEntityProcessor

2010-08-11 Thread Sascha Szott
Hi folks, why does FileListEntityProcessor ignore onError=continue and abort indexing if a directory or a file does not exist? I'm using both XPathEntityProcessor and FileListEntityProcessor with onError set to continue. In case a directory or file is not present an Exception is thrown and

Re: DataImportHandler in Solr 1.4.1: exception handling in FileListEntityProcessor

2010-08-11 Thread Sascha Szott
(DataImporter.java:331) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389) at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370) -Sascha On 11.08.2010 15:18, Sascha Szott wrote: Hi folks, why does FileListEntityProcessor ignore

Re: problem with formulating a negative query

2010-07-06 Thread Sascha Szott
Hi, Chris Hostetter wrote: AND, OR, and NOT are just syntactic-sugar for modifying the MUST, MUST_NOT, and SHOULD. The default op of OR only affects the first clause of your query (R) because it doesn't have any modifiers -- Thanks for pointing that out! -Sascha the second clause has that

Re: problem with formulating a negative query

2010-06-30 Thread Sascha Szott
Hi Erick, thanks for your explanations. But why are all docs being *removed* from the set of all docs that contain R in their topic field? This would correspond to a boolean AND and would stand in conflict with the clause q.op=OR. This seems a bit strange to me. Furthermore, Smiley & Pugh

Re: Is there a way to delete multiple documents using wildcard?

2010-06-30 Thread Sascha Szott
Hi, you can delete all docs that match a certain query: <delete><query>uid:6-HOST*</query></delete> -Sascha bbarani wrote: Hi, I am trying to delete a group of documents using wildcard. Something like
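
When the delete message is generated by a program instead of typed by hand, it is safer to let an XML library build it, so that special characters in the query are escaped properly. A minimal sketch using only the Python standard library (the function name is made up for this example; sending the message to the update handler is left out):

```python
import xml.etree.ElementTree as ET


def delete_by_query_xml(query):
    """Build a delete-by-query update message for the given query string."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(delete, encoding="unicode")


if __name__ == "__main__":
    print(delete_by_query_xml("uid:6-HOST*"))
```

The resulting string is what would be posted to the update handler with Content-Type text/xml, followed by a commit.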

Re: Is there a way to delete multiple documents using wildcard?

2010-06-30 Thread Sascha Szott
Hi, does /select?q=uid:6-HOST* return any documents? -Sascha bbarani wrote: Hi, Thanks a lot for your reply.. I tried the below query update?commit=true%20-H%20Content-Type:%20text/xml%20--data-binary%20'<delete><query>uid:6-HOST*</query></delete>' But even now none of the documents are getting

Re: Is there a way to delete multiple documents using wildcard?

2010-06-30 Thread Sascha Szott
Hi, take a look inside Solr's log file. Are there any error messages with respect to the update request? Furthermore, you could try the following two commands instead: curl "http://host:port/solr/update" --form-string stream.body="<delete><query>uid:6-HOST*</query></delete>" curl

problem with formulating a negative query

2010-06-29 Thread Sascha Szott
Hi folks, I have a (multi-valued) field topic in my index which does not need to exist in every document. Now, I'm struggling with formulating a query that returns all documents that either have no topic field at all *or* whose topic field value is R. Unfortunately, the query

Re: Specifiying multiple mlt.fl fields

2010-06-19 Thread Sascha Szott
Hi Darren, try mlt.fl=field1 field2 Best, Sascha Darren Govoni wrote: Hi, I read the wiki and tried about a dozen variations such as: ...mlt.fl=field1&mlt.fl=field2 and ...mlt.fl=field1,field2... to specify more than one MLT field and it won't take. What's the trick? Also, how to do it
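
Note that the space between the two field names has to be URL-encoded when the request is built by hand. A small sketch with Python's standard library shows what the encoded parameter looks like; the query value id:1 and the function name are placeholders for this example only.

```python
from urllib.parse import urlencode


def mlt_query_string(query, fields):
    """Encode a MoreLikeThis request with a space-separated mlt.fl value."""
    params = {
        "q": query,
        "mlt": "true",
        "mlt.fl": " ".join(fields),  # e.g. "field1 field2"
    }
    # urlencode turns the space into '+' and escapes reserved characters.
    return urlencode(params)


if __name__ == "__main__":
    print(mlt_query_string("id:1", ["field1", "field2"]))
```

The output can be appended to the select URL after the question mark.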

Re: federated / meta search

2010-06-18 Thread Sascha Szott
Hi Joe & Markus, sounds good! Maybe I should add a note on the Wiki page on federated search [1]. Thanks, Sascha [1] http://wiki.apache.org/solr/FederatedSearch Joe Calderon wrote: yes, you can use distributed search across shards with different schemas as long as the query only

federated / meta search

2010-06-17 Thread Sascha Szott
Hi folks, if I'm seeing it right Solr currently does not provide any support for federated / meta searching. Therefore, I'd like to know if anyone has already put effort in this direction? Moreover, is federated / meta search considered a scenario Solr should be able to deal with at all or

Re: strange results with query and hyphened words

2010-05-31 Thread Sascha Szott
followed by (auskunft or profiauskunft) you mentioned will occur. Best, Sascha -Original Message- From: Sascha Szott [mailto:sz...@zib.de] Sent: Sunday, May 30, 2010 19:01 To: solr-user@lucene.apache.org Subject: Re: strange results with query and hyphened words Hi Markus, I

Re: strange results with query and hyphened words

2010-05-31 Thread Sascha Szott
by the WordDelimiterFilter. What about using the PatternReplaceCharFilter at query time to eliminate all intra-word hyphens? -Sascha Sascha Szott wrote: Hi Markus, the default-config for index is: <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1"

Re: strange results with query and hyphened words

2010-05-30 Thread Sascha Szott
Hi Markus, I was facing the same problem a few days ago and found an explanation in the mail archive that clarifies my question regarding the usage of Solr's WordDelimiterFilterFactory: http://markmail.org/message/qoby6kneedtwd42h Best, Sascha markus.rietz...@rzf.fin-nrw.de wrote: i am

Re: sort by field length

2010-05-26 Thread Sascha Szott
Hi Erick, Erick Erickson wrote: Ah, I may have misunderstood, I somehow got it in my mind you were talking about the length of each term (as in string length). But if you're looking at the field length as the count of terms, that's another question, sorry for the confusion... I have to ask,

Re: Faceted search not working?

2010-05-25 Thread Sascha Szott
Hi Birger, Birger Lie wrote: I don't think the boolean fields are mapped to on and off :) You can use true and on interchangeably. -Sascha -birger -Original Message- From: Ilya Sterin [mailto:ster...@gmail.com] Sent: May 24, 2010 23:11 To: solr-user@lucene.apache.org Subject:

Re: sort by field length

2010-05-25 Thread Sascha Szott
Hi Erick, Erick Erickson wrote: Are you sure you want to recompute the length when sorting? It's the classic time/space tradeoff, but I'd suggest that when your index is big enough to make taking up some more space a problem, it's far too big to spend the cycles calculating each term length for

Re: Highlighting is not happening

2010-05-25 Thread Sascha Szott
=onereturned response which contains bquery/b should be bold/str /doc Regards Prakash -Original Message- From: Sascha Szott [mailto:sz...@zib.de] Sent: Monday, May 24, 2010 10:55 PM To: solr-user@lucene.apache.org Subject: Re: Highlighting is not happening Hi Prakash, can you provide 1

Re: Faceted search not working?

2010-05-25 Thread Sascha Szott
<str>query</str> <str>facet</str> </arr> </requestHandler> On 2010-05-25, at 3:32 AM, Sascha Szott wrote: Hi Birger, Birger Lie wrote: I don't think the boolean fields are mapped to on and off :) You can use true and on interchangeably. -Sascha -birger -Original Message- From: Ilya

sort by field length

2010-05-24 Thread Sascha Szott
Hi folks, is it possible to sort by field length without having to (redundantly) save the length information in a separate index field? At first, I thought to accomplish this using a function query, but I couldn't find an appropriate one. Thanks in advance, Sascha

Re: Highlighting is not happening

2010-05-24 Thread Sascha Szott
Hi Prakash, more importantly, check the field type and its associated analyzer. In case you use a non-tokenized type (e.g., string), highlighting will not appear if only a partial field match exists (only exact matches, i.e. the query coincides with the field value, will be highlighted). If

Re: Highlighting is not happening

2010-05-24 Thread Sascha Szott
Prakash -Original Message- From: Sascha Szott [mailto:sz...@zib.de] Sent: Monday, May 24, 2010 10:29 PM To: solr-user@lucene.apache.org Subject: Re: Highlighting is not happening Hi Prakash, more importantly, check the field type and its associated analyzer. In case you use a non

Re: Faceted search not working?

2010-05-24 Thread Sascha Szott
Hi Ilya, Ilya Sterin wrote: I'm trying to perform a faceted search without any luck. Result set doesn't return any facet information... http://localhost:8080/solr/select/?q=title:*&facet=on&facet.field=title I'm getting the result set, but no facet information present? Is there something else

Wildcard queries

2010-05-21 Thread Sascha Szott
Hi folks, what's the idea behind the fact that no text analysis (e.g. lowercasing) is performed on wildcarded search terms? In my context this behaviour seems to be counter-intuitive (I guess that's the case in the majority of applications) and my application needs to lowercase any input
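
Until the server side takes care of this, the lowercasing has to happen in the application before the query is sent. The following sketch lowercases only those whitespace-separated tokens that contain a wildcard and leaves field names and operators such as AND / OR untouched. It is a simplification under assumptions: the function name is invented, and quoted phrases or escaped characters are not handled.

```python
import re


def lowercase_wildcard_terms(query):
    """Lowercase the value part of every token that contains * or ?."""

    def fix(match):
        token = match.group(0)
        if "*" not in token and "?" not in token:
            return token  # plain terms are analyzed by Solr anyway
        # Keep an optional "field:" prefix as it is.
        field, sep, value = token.rpartition(":")
        return field + sep + value.lower()

    return re.sub(r"\S+", fix, query)


if __name__ == "__main__":
    print(lowercase_wildcard_terms("Title:Foll* AND Author:Smith"))
```

This only makes sense for fields whose index-time analysis lowercases the terms as well.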

Re: Wildcard queries

2010-05-21 Thread Sascha Szott
Hi Robert, thanks, you're absolutely right. I should refine my initial question to: What's the idea behind the fact that no *lowercasing* is performed on wildcarded search terms if the field in question contains a LowercaseFilter in its associated field type definition? -Sascha

Re: Autosuggest

2010-05-15 Thread Sascha Szott
Hi, maybe you would like to have a look at solr.ShingleFilterFactory [1] to expand your autosuggest to more than one term. -Sascha [1] http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory Blargy wrote: Thanks for your help and especially your analyzer..

Re: How to tell which field matched?

2010-05-15 Thread Sascha Szott
Hi, I'm not sure if debugQuery=on is a feasible solution in a production environment, as generating such extra information requires a considerable amount of computation. -Sascha Jon Baer wrote: Does the standard debug component (?debugQuery=on) give you what you need?

Re: Solr Schema Question

2010-04-17 Thread Sascha Szott
Hi Serdar, take a look at Solr's DataImportHandler: http://wiki.apache.org/solr/DataImportHandler Best, Sascha Serdar Sahin wrote: Hi, I am rather new to Solr and have a question. We have around 200.000 txt files which are placed into the file cloud. The file path is something similar to

Re: StreamingUpdateSolrServer hangs

2010-04-16 Thread Sascha Szott
Hi Yonik, Yonik Seeley wrote: Stephen, were you running stock Solr 1.4, or did you apply any of the SolrJ patches? I'm trying to figure out if anyone still has any problems, or if this was fixed with SOLR-1711: I'm using the latest trunk version (rev. 934846) and constantly running into the

Re: StreamingUpdateSolrServer hangs

2010-04-16 Thread Sascha Szott
Hi Yonik, thanks for your fast reply. Yonik Seeley wrote: Thanks for the report Sascha. So after the hang, it never recovers? Some amount of hanging could be visible if there was a commit on the Solr server or something else to cause the solr requests to block for a while... but it should

Re: Deploying Solr 1.3 in JBoss 5

2010-02-05 Thread Sascha Szott
Hi Luca, could you add a note to the Wiki page [1]? Thanks! -Sascha [1] http://wiki.apache.org/solr/SolrJBoss Luca Molteni wrote: By the way, I finally solved it. To deploy solr 1.3 in jboss 5, you simply have to remove xercesImpl-2.8.1.jar xml-apis-1.3.03.jar From the WEB-INF/lib

Re: (default) maximum chars per field

2010-02-05 Thread Sascha Szott
markus.rietz...@rzf.fin-nrw.de wrote: ok, i was looking for all types of max but somehow didn't see the maxFieldLength. this is a global parameter, right? can this be defined on a field basis? It's a global parameter counting the maximum number of tokens(!) - not the number of characters
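
The practical consequence of a token-based limit can be simulated in a few lines: everything after the first maxFieldLength tokens is simply not indexed, so a phrase near the end of a long field cannot be found. This is only a toy model under assumptions: whitespace splitting stands in for the real analyzer, and the function names are invented.

```python
def indexed_tokens(text, max_field_length):
    """Tokens that make it into the index when the field is cut off."""
    return text.split()[:max_field_length]


def phrase_is_searchable(text, phrase, max_field_length):
    """True if the phrase occurs within the indexed part of the field."""
    tokens = indexed_tokens(text, max_field_length)
    words = phrase.split()
    return any(
        tokens[i:i + len(words)] == words
        for i in range(len(tokens) - len(words) + 1)
    )


if __name__ == "__main__":
    text = "alpha beta gamma delta epsilon"
    print(phrase_is_searchable(text, "alpha beta", 3))     # near the start
    print(phrase_is_searchable(text, "delta epsilon", 3))  # beyond the limit
```

Raising the limit and reindexing the affected documents makes the later part of the field searchable again.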

Re: java.lang.NullPointerException with MySQL DataImportHandler

2010-02-02 Thread Sascha Szott
Hi, can you post * the output of MySQL's describe command for all tables/views referenced in your DIH configuration * the DIH configuration file (i.e., data-config.xml) * the schema definition (i.e., schema.xml) -Sascha Jean-Michel Philippon-Nadeau wrote: Hi, It is my first install of

Re: Deploying Solr 1.3 in JBoss 5

2010-02-02 Thread Sascha Szott
you very much. L.M. -- Sascha Szott Kooperativer Bibliotheksverbund Berlin-Brandenburg (KOBV) c/o Konrad-Zuse-Zentrum fuer Informationstechnik Berlin (ZIB) Takustr. 7, D-14195 Berlin Room 4357 Phone: (030) 841 85 - 457 Fax: (030) 841 85 - 269 E-Mail: sz...@zib.de WWW: http://www.kobv.de

Re: java.lang.NullPointerException with MySQL DataImportHandler

2010-02-02 Thread Sascha Szott
with productId 220213. Since no default value is specified, Solr raises an error when creating the index document. -Sascha Jean-Michel Philippon-Nadeau wrote: Hi, Thanks for the reply. On Tue, 2010-02-02 at 16:57 +0100, Sascha Szott wrote: * the output of MySQL's describe command for all tables/views

Re: Deploying Solr 1.3 in JBoss 5

2010-02-02 Thread Sascha Szott
Luca Molteni wrote: Actually, if I hard-code the value, it gives me the same error... interesting. According to the error message: The content of element type env-entry must match (description?,env-entry-name,env-entry-value?,env-entry-type) Maybe it helps to change the order of elements

Re: How to display Highlight with VelocityResponseWriter?

2010-01-13 Thread Sascha Szott
). -Sascha [1] http://wiki.apache.org/solr/VelocityResponseWriter#line-93 [2] http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/response/QueryResponse.html Quoting Sascha Szott sz...@zib.de: Qiuyan, with highlight can also be displayed in the web gui. I've added <bool name="hl">true</bool>

Re: How to display Highlight with VelocityResponseWriter?

2010-01-11 Thread Sascha Szott
Qiuyan, with highlight can also be displayed in the web gui. I've added <bool name="hl">true</bool> into the standard responseHandler and it already works, i.e without velocity. But the same line doesn't take effect in itas. Should i configure anything else? Thanks in advance. First of all, just a

Re: solrJ and spell check queries

2010-01-03 Thread Sascha Szott
Hi, Jay Fisher wrote: I'm trying to find a way to formulate the following query in solrJ. This is the only way I can get the desired result but I can't figure out how to get solrJ to generate the same query string. It always generates a url that starts with select and I need it to start with

Re: how to do a Parent/Child Mapping using entities

2009-12-30 Thread Sascha Szott
, value:url1 of res_url field is linked to value:1 of res_rank field, and all of them are linked to the common field keyword. I think that i should use a custom field analyser or some thing like that; but i don't know what to do. but thanks for all; and any supplied help will be lovable. Sascha Szott

Re: how to do a Parent/Child Mapping using entities

2009-12-29 Thread Sascha Szott
Hi, you could create an additional index field res_ranked_url that contains the concatenated value of a URL and its corresponding rank, e.g., res_rank + " " + res_url. Then, q=res_ranked_url:"1 url1" retrieves all documents with url1 as the first URL. A drawback of this
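
A sketch of how the combined field values and the matching query could be generated on the client side. The separator (a single space) and the assumption that res_ranked_url is a non-tokenized field are mine; the function names are made up for this example.

```python
def ranked_url_values(urls):
    """Values for the multi-valued res_ranked_url field, ranks start at 1."""
    return [f"{rank} {url}" for rank, url in enumerate(urls, start=1)]


def ranked_url_query(rank, url):
    """Query that matches documents having the given URL at the given rank."""
    # Quotes are needed because the value contains a space.
    return f'res_ranked_url:"{rank} {url}"'


if __name__ == "__main__":
    print(ranked_url_values(["url1", "url2", "url3"]))
    print(ranked_url_query(1, "url1"))
```

Real URLs contain characters such as the colon that have a meaning in the query syntax, so they would additionally need to be escaped before being used in a query.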

Re: Optimize not having any effect on my index

2009-12-18 Thread Sascha Szott
Hi Aleksander, Aleksander Stensby wrote: So i tried with curl: curl http://server:8983/solr/update --data-binary '<optimize/>' -H 'Content-type:text/xml; charset=utf-8' No difference here either... Am I doing anything wrong? Do i need to issue a commit after the optimize? Did you restart the

Re: Exception from Spellchecker

2009-12-15 Thread Sascha Szott
Hi Rafael, Rafael Pappert wrote: I try to enable the spellchecker in my 1.4.0 solr (running with tomcat 6 on debian). But I always get the following exception, when I try to open http://localhost:8080/spell?: The spellcheck=true pair is missing in your request. Try

RE: search on tomcat server

2009-12-07 Thread Sascha Szott
Hi Jill, just to make sure your index contains at least one document, what is the output of http://localhost:8080/solr/select?q=*:*&debugQuery=true&echoParams=all Best, Sascha Jill Han wrote: In fact, I just followed the instructions titled as Tomcat On Windows. Here are the updates on my

How to instruct MoreLikeThisHandler to sort results

2009-12-03 Thread Sascha Szott
Hi Folks, is there any way to instruct MoreLikeThisHandler to sort results? I was surprised to see that MLTHandler recognizes faceting parameters, among others, but ignores the sort parameter. Best, Sascha

Re: Hierarchical xml

2009-12-02 Thread Sascha Szott
Pooja, have a look at Solr's DataImportHandler. XPathEntityProcessor [1] should suit your needs. Best, Sascha [1] http://wiki.apache.org/solr/DataImportHandler#XPathEntityProcessor Pooja Verlani wrote: Hi, I want to index an xml like following: <officer> <name>John</name>

Re: Indexing file content with custom field

2009-12-02 Thread Sascha Szott
Piero, it sounds like you're looking for an integration of Solr Cell and Solr's DIH facility -- a feature that isn't implemented yet (but the issue is already addressed in SOLR-1358). As a workaround, you could store the extracted contents in plain text files (either by using Solr Cell or Apache

[Solved] Re: VelocityResponseWriter/Solritas character encoding issue

2009-11-27 Thread Sascha Szott
encoded HTML. That's it! Best, Sascha Erik Hatcher wrote: Sascha, Can you give me a test document that causes an issue? (maybe send me a Solr XML document in private e-mail). I'll see what I can do once I can see the issue first hand. Erik On Nov 18, 2009, at 2:48 PM, Sascha Szott

VelocityResponseWriter/Solritas character encoding issue

2009-11-18 Thread Sascha Szott
Hi, I've played around with Solr's VelocityResponseWriter (which is indeed a very useful feature for rapid prototyping). I've realized that Velocity uses ISO-8859-1 as default character encoding. I've changed this setting to UTF-8 in my velocity.properties file (inside the conf directory),

Re: VelocityResponseWriter/Solritas character encoding issue

2009-11-18 Thread Sascha Szott
the VelocityResponseWriter returns a lot of Unicode replacement characters (U+FFFD) instead. -Sascha On Nov 18, 2009, at 2:48 PM, Sascha Szott wrote: Hi, I've played around with Solr's VelocityResponseWriter (which is indeed a very useful feature for rapid prototyping). I've realized

Re: Indexing multiple documents in Solr/SolrCell

2009-11-17 Thread Sascha Szott
, Sascha Szott sz...@zib.de wrote: Hi, the problem you've described -- an integration of DataImportHandler (to traverse the XML file and get the document urls) and Solr Cell (to extract content afterwards) -- is already addressed in issue SOLR-1358 ( https://issues.apache.org/jira/browse/SOLR-1358

Re: Indexing multiple documents in Solr/SolrCell

2009-11-16 Thread Sascha Szott
Hi, the problem you've described -- an integration of DataImportHandler (to traverse the XML file and get the document urls) and Solr Cell (to extract content afterwards) -- is already addressed in issue SOLR-1358 (https://issues.apache.org/jira/browse/SOLR-1358). Best, Sascha Kerwin

Re: [DIH] blocking import operation

2009-11-12 Thread Sascha Szott
Noble Paul wrote: Yes , open an issue . This is a trivial change I've opened JIRA issue SOLR-1554. -Sascha On Thu, Nov 12, 2009 at 5:08 AM, Sascha Szott sz...@zib.de wrote: Noble, Noble Paul wrote: DIH imports are really long running. There is a good chance that the connection times out

Re: [DIH] concurrent requests to DIH

2009-11-12 Thread Sascha Szott
capabilities, though issue SOLR-1352 mainly targets the latter. Is your PDIH implementation able to deal with batch processing right now? Best, Sascha On Thu, Nov 12, 2009 at 6:35 AM, Sascha Szott sz...@zib.de wrote: Hi all, I'm using the DIH in a parameterized way by passing request parameters

Re: [DIH] blocking import operation

2009-11-11 Thread Sascha Szott
on adding a callback url to DIH a month ago, but it seems that no issue was raised. So, up to now it's only possible to implement an appropriate Solr EventListener. Should we open an issue for supporting callback urls? Best, Sascha On Tue, Nov 10, 2009 at 12:12 AM, Sascha Szott sz...@zib.de wrote

[DIH] concurrent requests to DIH

2009-11-11 Thread Sascha Szott
Hi all, I'm using the DIH in a parameterized way by passing request parameters that are used inside of my data-config. All imports end up in the same index. 1. Is it considered as good practice to set up several DIH request handlers, one for each possible parameter value? 2. In case the range

[DIH] SqlEntityProcessor does not recognize onError attribute

2009-11-09 Thread Sascha Szott
Hi all, as stated in the Solr wiki, Solr 1.4 allows you to specify an onError attribute for *each* entity listed in the data config file (it is considered as one of the default attributes). Unfortunately, the SqlEntityProcessor does not recognize the attribute's value -- i.e., in case an SQL

Re: [DIH] SqlEntityProcessor does not recognize onError attribute

2009-11-09 Thread Sascha Szott
Hi, Noble Paul നോബിള്‍ नोब्ळ् wrote: On Mon, Nov 9, 2009 at 4:24 PM, Sascha Szott sz...@zib.de wrote: Hi all, as stated in the Solr wiki, Solr 1.4 allows you to specify an onError attribute for *each* entity listed in the data config file (it is considered as one of the default attributes

[DIH] blocking import operation

2009-11-09 Thread Sascha Szott
Hi all, currently, DIH's import operation(s) only work asynchronously. Therefore, after submitting an import request, DIH returns immediately, while the import process (in case a large amount of data needs to be indexed) continues asynchronously behind the scenes. So, what is the

Re: How to use DataImportHandler with ExtractingRequestHandler?

2009-09-03 Thread Sascha Szott
Hi Khai, a few weeks ago, I was facing the same problem. In my case, this workaround helped (assuming, you're using Solr 1.3): For each row, extract the content from the corresponding pdf file using a parser library of your choice (I suggest Apache PDFBox or Apache Tika in case you need to

Building documents using content residing both in database tables and text files

2009-08-11 Thread Sascha Szott
Hello, is it possible (and if it is, how can I accomplish it) to configure DIH to build up index documents by using content that resides in different data sources? Here is an example scenario: Let's assume we have a table T with two columns, ID (which is the primary key of T) and TITLE.

Re: Building documents using content residing both in database tables and text files

2009-08-11 Thread Sascha Szott
Hi Noble, Noble Paul wrote: isn't it possible to do this by having two datasources (one Jdbc and another File) and two entities . The outer entity can read from a DB and the inner entity can read from a file. Yes, it is. Here's my db-data-config.xml file: <!-- definition of data sources -->