Re: Re: How to properly use Levenstein distance with ~ in Java
Hi Aleksander,

the fuzzy search operator '~' is not supported in dismax (defType=dismax), see https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser

You are using the spellchecker SearchComponent. This does not change the query results.

btw: It looks like you are using the path /select with qt=dismax. This would normally throw an exception. Is there a tag <requestHandler name="/dismax" ...> inside your solrconfig.xml?

Best regards
Karsten

P.S. in context: http://lucene.472066.n3.nabble.com/How-to-properly-use-Levenstein-distance-with-in-Java-td4164793.html

On 20 October 2014 11:13, Aleksander Sadecki wrote:
Ok, thank you for your response. But why can I not use '~'?
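For illustration, a minimal SolrJ sketch that sends the fuzzy query through the default lucene parser instead of dismax (URL, core and field name are assumptions, not from the original thread):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FuzzyQueryDemo {
  public static void main(String[] args) throws Exception {
    // assumed URL and core name
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
    // the '~' fuzzy syntax is understood by the lucene parser, not by dismax
    SolrQuery q = new SolrQuery("name:levenstein~2");
    q.set("defType", "lucene"); // explicit; this is the default for /select
    QueryResponse rsp = server.query(q);
    System.out.println("hits: " + rsp.getResults().getNumFound());
  }
}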
Re: Best way to index Solr XML from w/in the same servlet container
Hi Jay,

I would like to see the Zookeeper watcher as part of DIH in Solr. Possibly you could extend org.apache.solr.handler.dataimport.DataSource. If you want to call Solr without HTTP you can use SolrJ: org.apache.solr.client.solrj.embedded.EmbeddedSolrServer

Best regards
Karsten

-------- Original Message --------
Date: Mon, 17 Sep 2012 13:29:53 -0700
From: Jay Hill jayallenh...@gmail.com
To: solr-user@lucene.apache.org
Subject: Best way to index Solr XML from w/in the same servlet container

I've created a custom process in Solr that has a Zookeeper watcher configured to pull Solr XML files from a znode. When I receive a file I can send the file to /update and get it indexed, but that seems inefficient. I could use SolrJ, but I believe that is still sending an HTTP request to /update. Is there a better way to do this, or is SolrJ running w/in the same servlet container that is running Solr the most efficient way to index?

Thanks,
-Jay
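A minimal sketch of the EmbeddedSolrServer route, indexing in-process with no HTTP round trip (paths, core name and field names are assumptions; SolrJ 3.x-style initialization):

import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;

public class EmbeddedIndexer {
  public static void main(String[] args) throws Exception {
    // assumed Solr home; loads the cores without any servlet container
    System.setProperty("solr.solr.home", "/path/to/solr/home");
    CoreContainer.Initializer initializer = new CoreContainer.Initializer();
    CoreContainer coreContainer = initializer.initialize();
    EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "collection1");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1"); // field name assumed
    server.add(doc);         // indexed in-process
    server.commit();
    coreContainer.shutdown();
  }
}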
Re: DataImport using last_indexed_id or getting max(id) quickly
Hi Avenka,

you asked for a HowTo to add a field inverseID which allows calculating max(id) from its first term:

If you do not use Solr, you have to calculate -id yourself and store it in an extra field inverseID. If you fill Solr with your own code, add a TrieLongField inverseID and fill it with the value -id. If you only want to change schema.xml (and add some classes):
* You need a new FieldType inverseLongType and a field inverseID of type inverseLongType
* You need a line <copyField source="id" dest="inverseID"/> (see http://wiki.apache.org/solr/SchemaXml#Copy_Fields)

For inverseLongType I see two possibilities:
a) use TextField and write your own filter to calculate -id
b) extend TrieLongField to a new FieldType InverseTrieLongField with:

@Override
public String readableToIndexed(String val) {
  return super.readableToIndexed(Long.toString(-Long.parseLong(val)));
}

@Override
public Fieldable createField(SchemaField field, String externalVal, float boost) {
  return super.createField(field, Long.toString(-Long.parseLong(externalVal)), boost);
}

@Override
public Object toObject(Fieldable f) {
  Object result = super.toObject(f);
  if (result instanceof Long) {
    return new Long(-((Long) result).longValue());
  }
  return result;
}

Best regards
Karsten

View this message in context: http://lucene.472066.n3.nabble.com/DataImport-using-last-indexed-id-or-getting-max-id-quickly-tp3993763p3994560.html

-------- Original Message --------
Date: Wed, 11 Jul 2012 20:59:10 -0700 (PDT)
From: avenka ave...@gmail.com
To: solr-user@lucene.apache.org
Subject: Re: DataImport using last_indexed_id or getting max(id) quickly

Thanks. Can you explain more the first TermsComponent option to obtain max(id)? Do I have to modify schema.xml to add a new field? How exactly do I query for the lowest value of -id?

--
View this message in context: http://lucene.472066.n3.nabble.com/DataImport-using-last-indexed-id-or-getting-max-id-quickly-tp3993763p3994560.html
Sent from the Solr - User mailing list archive at Nabble.com.
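A sketch of the TermsComponent request itself via SolrJ (the qt routing, handler configuration and server URL are assumptions; with Trie fields this also assumes a precisionStep that yields one indexed term per value):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
SolrQuery q = new SolrQuery();
q.set("qt", "/terms");        // route to a handler with the TermsComponent
q.set("terms", true);
q.set("terms.fl", "inverseID");
q.set("terms.sort", "index"); // smallest term first, i.e. -max(id)
q.set("terms.limit", "1");
QueryResponse rsp = server.query(q);
long maxId = -Long.parseLong(
    rsp.getTermsResponse().getTerms("inverseID").get(0).getTerm());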
Re: NRT and multi-value facet - what is Solr's limit?
Hi Andy,

as long as the cache for faceting is not per segment, there is no NRT together with faceting. This is what Jason told you in http://lucene.472066.n3.nabble.com/Nrt-and-caching-td3993612.html and I agree. Possibly you could use multicore.

Best regards
Karsten

-------- Original Message --------
Date: Thu, 12 Jul 2012 03:18:47 -0700 (PDT)
From: Andy angelf...@yahoo.com
To: solr-user@lucene.apache.org
Subject: NRT and multi-value facet - what is Solr's limit?

Hi, I understand that the cache for multi-value facet is multi-segment. So every time a document is updated the entire cache needs to be rebuilt. Is there any rule of thumb on the highest update rate NRT can handle before this cache-rebuild-on-each-commit becomes too expensive? I know it depends, but I'm just looking for order-of-magnitude estimates. Are we talking about 10 updates/s? 100? 1,000? Thanks
Re: Nrt and caching
Hi Andy,

multi-value faceting is a special case of taxonomy, so it is covered by the org.apache.lucene.facet package (lucene/facet). This is not per segment, but it works without a per-IndexSearcher cache. So imho taxonomy faceting will work with NRT. Because of the new TermsEnum#ord() method, the class UnInvertedField has already lost half of its code lines. UnInvertedField would work per segment if the ordinal position of a term did not change on commit, which is the basic idea of the taxonomy solution. So I am quite sure that Solr will adopt this approach at some point; I do not know about soon.

Best regards
Karsten

in context: http://lucene.472066.n3.nabble.com/Nrt-and-caching-tp3993612p3993700.html

-------- Original Message --------
Date: Sat, 7 Jul 2012 17:32:52 -0700 (PDT)
From: Andy angelf...@yahoo.com
To: solr-user@lucene.apache.org
Subject: Re: Nrt and caching

Jason, if I just use stock Solr 4.0 without modifying the source code, does that mean multi-value faceting will be very slow when I'm constantly inserting/updating documents? Which open source library are you referring to? Will Solr adopt this per-segment approach any time soon? Thanks
Re: Unable to determine why query won't return results
Hi Kurt,

I took your fieldtype definition and could not reproduce your problem with Solr 3.4. But I think you have a problem with the ampersand in "A. J. Johnson & Co.". Two comments: In your analysis HTML example there is a gap of two positions between "Johnson" and "Co". That should not happen ("A. J. Johnson & Co." is indexed like "A J Johnson Co"). Possibly you have an encoding problem with the ampersand? Do you use SolrJ for URL generation?

Best regards
Karsten

-------- Original Message --------
Date: Wed, 9 Nov 2011 21:47:20 +0000
From: Nordstrom, Kurt kurt.nordst...@unt.edu
To: solr-user@lucene.apache.org
Subject: Unable to determine why query won't return results

Hello all. I'm having an issue in regards to matching a quoted phrase in Solr, and I'm not certain what the issue at hand is. I have tried this on both Solr 1.3 (our production system) and 3.3 (our development system).

The field is a text field, and has the following fieldType definition: http://pastebin.com/SkmmucUE

In the case where the search is failing, the field is indexed with the following value: A. J. Johnson & Co.

We are searching the field with the following string (in quotes): "A. J. Johnson & Co."

Unfortunately, we get a response of no results when searching the field in question with the above specified string. If we search merely for "A. J. Johnson" (with quotes), we get the desired result. Using the full string, however, seems to cause the results not to match. I have attempted to use Solr's analyzer (without success) to trace the problem. The results of this are here: http://pastehtml.com/view/bdgpdrt0w.html

Any suggestions?
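A small sketch of escaping the ampersand when building the request URL by hand (field name and URL are assumptions; SolrJ does this encoding automatically):

import java.net.URLEncoder;

public class EscapeAmpersand {
  public static void main(String[] args) throws Exception {
    String phrase = "\"A. J. Johnson & Co.\"";
    String url = "http://localhost:8983/solr/select?q=name:"
        + URLEncoder.encode(phrase, "UTF-8");
    // '&' becomes %26; sent unescaped it would terminate the q parameter
    System.out.println(url);
  }
}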
Re: [Profiling] How to profile/tune Solr server
Hi Spark,

in 2009 there was a monitor from Lucid Imagination: http://www.lucidimagination.com/about/news/releases/lucid-imagination-releases-performance-monitoring-utility-open-source-apache-lucene

A colleague of mine calls the Sematext monitor a trojan, because SPM phones home: "Easy in, easy out - if you try SPM and don't like it, simply stop and remove the small client-side piece that sends us your data" http://sematext.com/spm/solr-performance-monitoring/index.html

It looks like other people use a real profiler like YourKit Java Profiler: http://forums.yourkit.com/viewtopic.php?f=3&t=3850

There is also an article about Zabbix: http://www.lucidimagination.com/blog/2011/10/02/monitoring-apache-solr-and-lucidworks-with-zabbix/

In your case any profiler would do, but if you find a profiler with Solr-specific default filters, let me know.

Best regards
Karsten

P.S. eMail in context: http://lucene.472066.n3.nabble.com/Profiling-How-to-profile-tune-Solr-server-td3467027.html

-------- Original Message --------
Date: Mon, 31 Oct 2011 18:35:32 +0800
From: yu shen shenyu...@gmail.com
To: solr-user@lucene.apache.org
Subject: Re: [Profiling] How to profile/tune Solr server

No idea so far, try to figure out. Spark

2011/10/31 Jan Høydahl jan@cominvent.com:
Hi, there are no official tools other than looking at the built-in stats pages and perhaps using JConsole or similar JVM monitoring tools. Note that Solr's JMX capabilities may let you hook your enterprise's existing monitoring dashboard up to Solr. Also check out the new monitoring service from Sematext which will give you graphs and all. So far it's a free evaluation: http://sematext.com/spm/index.html Do you have a clue for why the indexing is slow?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 31. okt. 2011, at 04:59, yu shen wrote:
Hi All, I am a Solr newbie. I find the Solr documents easy to access and use, which is a really good thing. My problem is that I did not find a Solr home-grown profiling/monitoring tool. I set up the server as a multi-core server; each core has approximately 2GB of index. And I need to update Solr and re-generate the index in a real-time manner (in Java code, using SolrJ). Sometimes the update operation is slow. It is expected that in a year the index size may increase to 4GB, and I need to do something to prevent performance degradation. Is there any official Solr monitoring/profiling tool for this? Spark
Re: Limit by score? sort by other field
Hi Robert,

take a look at http://lucene.472066.n3.nabble.com/How-to-cut-off-hits-with-score-below-threshold-td3219064.html#a3219117 and http://lucene.472066.n3.nabble.com/Filter-by-relevance-td1837486.html

So will sort=date+desc&q={!frange l=0.85}query($qq)&qq=(the original relevancy query) help?

Best regards
Karsten

-------- Original Message --------
Date: Thu, 27 Oct 2011 12:30:31 +0100
From: Robert Brown r...@intelcompute.com
To: solr-user@lucene.apache.org
Subject: Limit by score? sort by other field

When we display search results to our users we include a percentage score: top result being 100%, then all others normalised based on the maxScore, calculated outside of Solr. We now want to limit the returned docs to a percentage score higher than, say, 50%. E.g. we want to search but only return docs scoring above 80%, yet sort by date, hence not being able to just sort by score.
Re: data-import problem
Hi Radha Krishna,

try the command full-import instead of fullimport, see http://wiki.apache.org/solr/DataImportHandler#Commands

Best regards
Karsten

-------- Original Message --------
Date: Mon, 24 Oct 2011 11:10:22 +0530
From: Radha Krishna Reddy radhakrishn...@gmail.com
To: solr-user@lucene.apache.org
Subject: data-import problem

Hi,

I am trying to configure Solr on an AWS Ubuntu instance. I have MySQL on a different server, so I created an SSH tunnel for MySQL on port 3309, downloaded the MySQL JDBC driver and copied it to the lib folder.

I edited the example/solr/conf/solrconfig.xml ...

When I tried to import data with
http://myservername/solr/dataimport?command=fullimport
I am getting the following response:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int></lst>
  <lst name="initArgs"><lst name="defaults"><str name="config">data-config.xml</str></lst></lst>
  <str name="command">fullimport</str>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages"/>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>

Can someone help me with this? Also, where can I find the logs?

Thanks and Regards,
Radha Krishna.
Re: indexing key value pair into lucene solr index
Hi Jame,

you can
- generate one token for each pair (key, value) -- key_value
- insert a gap between each pair and use phrase queries
- use the key as field name (if you have a restricted set of keys)
- wait for joins in Solr 4.0 (http://wiki.apache.org/solr/Join)
- use positions or payloads to connect key and value
- tell the forum your exact use case with examples

Best regards
Karsten

-------- Original Message --------
Date: Mon, 24 Oct 2011 17:11:49 +0530
From: jame vaalet jamevaa...@gmail.com
To: solr-user@lucene.apache.org
Subject: indexing key value pair into lucene solr index

hi, in my use case I have a list of key-value pairs in each document object. If I index them as separate index fields, then in the result doc object I will get two arrays corresponding to my keys and values. The problem I face here is that there won't be any mapping between those keys and values. Do we have any easy way to index this data in Solr? thanks in advance ...
--
-JAME
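A sketch of the first option, one token per pair (field name "kv" and the key_value format are invented for the example; the field should use a keyword-style analyzer so the combined token survives):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
SolrInputDocument doc = new SolrInputDocument();
doc.addField("kv", "bloodType_AB"); // pair (bloodType, AB) as one searchable token
doc.addField("kv", "gender_male");
server.add(doc);
server.commit();
// q=kv:bloodType_AB now matches only documents containing exactly that pair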
Re: indexing key value pair into lucene solr index
Hi Jame,

preserve order in index fields: if you don't want to use phrase queries in key or value, this order is the position. If you use phrase queries but no value has more than 50 tokens, you could also use positions and start each pair at position 100, 200, 300 ... Otherwise you could use payloads. Imho there is no standard way to connect the positions of two fields; you would have to write your own Query. My tip: take org.apache.lucene.search.spans.TermSpans as a starting point and use the queryparser module.

btw: normally there is a standard solution in Lucene for each problem. So please tell us more about your use case and somebody may have an answer that does not require programming on your own.

Best regards
Karsten

-------- Original Message --------
Date: Mon, 24 Oct 2011 17:53:26 +0530
From: jame vaalet jamevaa...@gmail.com
To: solr-user@lucene.apache.org
Subject: Re: indexing key value pair into lucene solr index

thanks karsten. Can we preserve order within an index field? If yes, I can index them separately and map them using their order.

On 24 October 2011 17:32, karsten-s...@gmx.de wrote:
Hi Jame,

you can
- generate one token for each pair (key, value) -- key_value
- insert a gap between each pair and use phrase queries
- use the key as field name (if you have a restricted set of keys)
- wait for joins in Solr 4.0 (http://wiki.apache.org/solr/Join)
- use positions or payloads to connect key and value
- tell the forum your exact use case with examples

Best regards
Karsten

-------- Original Message --------
Date: Mon, 24 Oct 2011 17:11:49 +0530
From: jame vaalet jamevaa...@gmail.com
To: solr-user@lucene.apache.org
Subject: indexing key value pair into lucene solr index

hi, in my use case I have a list of key-value pairs in each document object. If I index them as separate index fields, then in the result doc object I will get two arrays corresponding to my keys and values. The problem I face here is that there won't be any mapping between those keys and values. Do we have any easy way to index this data in Solr? thanks in advance ...
--
-JAME
Re: Can Solr handle large text files?
Hi Peter,

highlighting in large text files cannot be fast without dividing the original text into small pieces. So take a look at http://xtf.cdlib.org/documentation/under-the-hood/#Chunking and at http://www.lucidimagination.com/blog/2010/09/16/2446/

Which means that you should divide your files and use Result Grouping / Field Collapsing to list only one hit per original document. (xtf would also solve your problem out of the box, but xtf does not use Solr.)

Best regards
Karsten

-------- Original Message --------
Date: Thu, 20 Oct 2011 17:59:04 -0700
From: Peter Spam ps...@mac.com
To: solr-user@lucene.apache.org
Subject: Can Solr handle large text files?

I have about 20k text files, some very small, but some up to 300MB, and would like to do text searching with highlighting. Imagine the text is the contents of your syslog. I would like to type in some terms, such as "error" and "mail", and have Solr return the syslog lines with those terms PLUS two lines of context, pretty much just like Google's highlighting.

1) Can Solr handle this? I had extremely long query times when I tried this with Solr 1.4.1 (yes, I was using TermVectors, etc.). I tried breaking the files into 1MB pieces, but searching would be wonky, returning the wrong number of documents (i.e. if one file had a term 5 times, and that was the only file that had the term, I want 1 result, not 5 results).

2) What sort of tokenizer would be best? Here's what I'm using:

<field name="body" type="text_pl" indexed="true" stored="true" multiValued="false" termVectors="true" termPositions="true" termOffsets="true" />

<fieldType name="text_pl" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
  </analyzer>
</fieldType>

Thanks!
Pete
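A sketch of the grouping side after chunking (Solr 3.3+ result grouping; the URL and the field linking each chunk to its source file are assumptions):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
SolrQuery q = new SolrQuery("body:(error AND mail)");
q.set("group", true);
q.set("group.field", "parentDocId"); // assumed field shared by all chunks of one file
q.set("group.limit", "1");           // at most one chunk hit per original document
QueryResponse rsp = server.query(q);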
Re: Migration from Autonomy IDOL to SOLR
Hi Arcadius,

currently we have a migration project from the Verity K2 search server to Solr. I do not know IDOL, but Autonomy bought Verity before IDOL was released, so possibly they are comparable? Verity K2 works directly on XML files; as a result, the query syntax is a little bit like XPath, e.g. with "text1 IN zone2 IN zone1" instead of contains(//zone1/zone2,'text1'). About the Verity query syntax: http://gregconely.getmyip.com/dl/OTG%20Software/5.30.087%20Suite%20%28SP3%29/Disc%204%20-%20Verity/Verity%20K2%20Server%205.5/doc/docs/pdf/VerityQueryLanguage.pdf

Does IDOL work the same way?

Best regards
Karsten

P.S. in context: http://lucene.472066.n3.nabble.com/Migration-from-Autonomy-IDOL-to-SOLR-td3255377.html

-------- Original Message --------
Date: Mon, 15 Aug 2011 11:11:36 +0100
From: Arcadius Ahouansou arcad...@menelic.com
To: solr-user@lucene.apache.org
Subject: Migration from Autonomy IDOL to SOLR

Hello. We have a couple of applications running on half a dozen Autonomy IDOL servers. Currently, all features we need are supported by Solr. We have done some internal testing and realized that Solr would do a better job. So we are investigating all possibilities for a smooth migration from IDOL to Solr. I am looking for advice from people who went through something similar. Ideally, we would like to keep most of our legacy code unchanged and have a kind of query-translation layer plugged into our app if possible. - Is there a lib available? - Any thoughts? Thanks. Arcadius.
Re: string cut-off filter?
Hi Bernd,

I also searched for such a filter but did not find it.

Best regards
Karsten

P.S. I am now using this filter:

public class CutMaxLengthFilter extends TokenFilter {

  public static final int DEFAULT_MAXLENGTH = 15;

  private final int maxLength;
  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);

  public CutMaxLengthFilter(TokenStream in) {
    this(in, DEFAULT_MAXLENGTH);
  }

  public CutMaxLengthFilter(TokenStream in, int maxLength) {
    super(in);
    this.maxLength = maxLength;
  }

  @Override
  public final boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false;
    }
    int length = termAtt.length();
    if (maxLength > 0 && length > maxLength) {
      termAtt.setLength(maxLength);
    }
    return true;
  }
}

with this factory:

public class CutMaxLengthFilterFactory extends BaseTokenFilterFactory {

  private int maxLength;

  @Override
  public void init(Map<String, String> args) {
    super.init(args);
    maxLength = getInt("maxLength", CutMaxLengthFilter.DEFAULT_MAXLENGTH);
  }

  public TokenStream create(TokenStream input) {
    return new CutMaxLengthFilter(input, maxLength);
  }
}

-------- Original Message --------
Date: Mon, 08 Aug 2011 10:15:45 +0200
From: Bernd Fehling bernd.fehl...@uni-bielefeld.de
To: solr-user@lucene.apache.org
Subject: string cut-off filter?

Hi list, is there a string cut-off filter to limit the length of a KeywordTokenized string? So the string should not be dropped, only limited to a certain length.

Regards
Bernd
Re: Update some fields for all documents: LUCENE-1879 vs. ParallelReader .FilterIndex
Hi Erick,

thanks a lot! This looks like a good idea: our queries with the changeable fields fit the join idea from https://issues.apache.org/jira/browse/SOLR-2272 because
- we do not need relevance ranking
- we can separate into a conjunction of a query on the changeable fields and a query on our other stable fields

So we can use something like
q=stablefields:query1&fq={!join from=changeable_fields_doc_id to=stable_fields_doc_id}changeablefields:query2

The only drawback compared to the ParallelReader solution is that our stored fields and term vectors will be divided over two Lucene docs, which is OK in our use case.

Best regards
Karsten

in context: http://lucene.472066.n3.nabble.com/Update-some-fields-for-all-documents-LUCENE-1879-vs-ParallelReader-amp-FilterIndex-td3215398.html

-------- Original Message --------
Date: Wed, 3 Aug 2011 22:11:08 -0400
From: Erick Erickson erickerick...@gmail.com
To: solr-user@lucene.apache.org
Subject: Re: Update some fields for all documents: LUCENE-1879 vs. ParallelReader .FilterIndex

Hmmm, the only thing that comes to mind is the join feature being added to Solr 4.x, but I confess I'm not entirely familiar with that functionality so can't tell if it really solves your problem. Other than that I'm out of ideas, but then again it's late and I'm tired, so maybe I'm not being very creative <g>...

Best
Erick

On Aug 3, 2011 11:40 AM, karsten-s...@gmx.de wrote:
Re: Update some fields for all documents: LUCENE-1879 vs. ParallelReader .FilterIndex
Hi Erick,

our two changeable fields are used for linking between documents on the application level. From the Lucene point of view they are just two searchable fields, with a stored term vector for one of them. Our queries will use one of these fields and a couple of fields from the stable fields. So the question really is about updating two fields in an existing Lucene index with more than fifty other fields.

Best regards
Karsten

P.S. about our linking between documents: Our two fields are called outgoingLinks and possibleIncomingLinks. Our source documents have an abstract and a couple of metadata fields. We use regular expressions to find outgoing links in this abstract, i.e. a couple of words which indicate 1. that the author made a reference (like in "my previous work published as 'Very important Article' in Nature 2010, 12 page 7") and 2. that this reference contains metadata to another document. Each of these links is transformed to a special key (2010NaturNr12Page7). On the other side, we transform the metadata to all possible keys. This key generation grows with our knowledge of possible link patterns. For the Lucene indexer this is a black box: there is a service which produces the keys for outgoing and possibleIncoming from our source (XML) documents, and these keys must be searchable in Lucene/Solr.

P.P.S. in context: http://lucene.472066.n3.nabble.com/Update-some-fields-for-all-documents-LUCENE-1879-vs-ParallelReader-amp-FilterIndex-td3215398.html

-------- Original Message --------
Date: Wed, 3 Aug 2011 09:57:03 -0400
From: Erick Erickson erickerick...@gmail.com
To: solr-user@lucene.apache.org
Subject: Re: Update some fields for all documents: LUCENE-1879 vs. ParallelReader .FilterIndex

How are these fields used? Because if they're not used for searching, you could put them in their own core and rebuild that index at your whim, then query that core when you need the relationship information. If you have a DB backing your system, you could perhaps store the info there and query that (but I like the second core better <g>).. But if you could use a separate index just for the relationships, you wouldn't have to deal with the slow re-indexing of all the docs...

Best
Erick

On Mon, Aug 1, 2011 at 4:12 AM, karsten-s...@gmx.de wrote:
Hi lucene/solr folks,

Issue: Our documents are stable except for two fields which are used for linking between the docs. So we would like to update these two fields in a batch once a month (possibly once a week). We cannot reindex all docs once a month, because we are using XeLDA in some fields for stemming (morphological analysis), and XeLDA is slow. We have 14 million docs (less than 100 GByte main index and 3 GByte for these two changeable fields). In the next half year we will be migrating our search engine from Verity K2 to Solr, so we could wait for Solr 4.0 (btw, any news about http://lucene.472066.n3.nabble.com/Release-schedule-Lucene-4-td2256958.html ?).

Solution? Our issue is exactly the purpose of ParallelReader. But Solr does not support ParallelReader (for a good reason: http://lucene.472066.n3.nabble.com/Vertical-Partitioning-advice-td494623.html#a494624 ). So I see two possible ways to solve our issue:

1. Wait for the new "Parallel incremental indexing" ( https://issues.apache.org/jira/browse/LUCENE-1879 ) and hope that Solr will integrate it.
Pro:
- nothing to do for us except waiting.
Contra:
- I did not find anything of the (old) patch in the current trunk.

2. Change the Lucene index below/without Solr in a batch:
a) Each month generate a new index with only our two changed fields (e.g. with DIH)
b) Use FilterIndex and ParallelReader to mock a correct index
c) “Merge” this mock index to a new index (via IndexWriter.addIndexes(IndexReader...))
Pro:
- The patch for https://issues.apache.org/jira/browse/LUCENE-1812 should be a good example of how to do this.
Contra:
- The relation between docId and document index order is not a guaranteed feature of DIH (e.g. we will have to split the main index to ensure that no merge will occur in/after DIH).
- To run this batch, Solr has to be stopped and restarted.
- Even if we know that our two fields should change only for a subset of the docs, we nevertheless have to reindex these two fields for all the docs.

Any comments, hints or tips? Is there a third (better) way to solve our issue? Is there already a working example of the 2nd solution? Will LUCENE-1879 (Parallel incremental indexing) be part of Solr 4.0?

Best regards
Karsten
Re: xpath expression not working
Hi abhayd,

XPathEntityProcessor supports only a subset of XPath, like div[@id=2] but not [id=2]. Take a look at https://issues.apache.org/jira/browse/SOLR-1437#commentauthor_12756469_verbose

I solved this problem by using XSLT as a preprocessor (with full XPath). The drawback is wasted performance: see http://lucene.472066.n3.nabble.com/DIH-Enhance-XPathRecordReader-to-deal-with-body-FLATTEN-true-and-body-h1-td2799005.html

Best regards
Karsten

-------- Original Message --------
Date: Mon, 1 Aug 2011 23:21:45 -0700 (PDT)
From: abhayd ajdabhol...@hotmail.com
To: solr-user@lucene.apache.org
Subject: xpath expression not working

hi, I have an XML doc which I would like to index using the XPath entity processor:

<add>
  <doc>
    <id>1</id>
    <details>xyz</details>
  </doc>
  <doc>
    <id>2</id>
    <details>xyz2</details>
  </doc>
</add>

If I want to just load the document with id=2, how would that work? I tried an XPath expression that works with XPath tools, but not in Solr.

<dataConfig>
  <dataSource type="FileDataSource" />
  <document>
    <entity name="f" processor="FileListEntityProcessor" baseDir="c:\temp" fileName="promotions.xml" recursive="false" rootEntity="false" dataSource="null">
      <entity name="x" processor="XPathEntityProcessor" forEach="/add/doc" url="${f.fileAbsolutePath}" pk="id">
        <field column="id" xpath="/add/doc/[id=2]/id"/>
      </entity>
    </entity>
  </document>
</dataConfig>

Any help how I can do this?

--
View this message in context: http://lucene.472066.n3.nabble.com/xpath-expression-not-working-tp3218133p3218133.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Store complete XML record (DIH XPathEntityProcessor)
Hi g, Hi Chantal,

I had the same problem. You can use XPathEntityProcessor, but you have to insert an XSL. The drawback is wasted performance: see http://lucene.472066.n3.nabble.com/DIH-Enhance-XPathRecordReader-to-deal-with-body-FLATTEN-true-and-body-h1-td2799005.html

Best regards
Karsten

-------- Original Message --------
Date: Mon, 1 Aug 2011 12:17:45 +0200
From: Chantal Ackermann chantal.ackerm...@btelligent.de
To: solr-user@lucene.apache.org
Subject: Re: Store complete XML record (DIH XPathEntityProcessor)

Hi g,

ok, I understand your problem now. (Sorry for answering that late.) I don't think PlainTextEntityProcessor can help you. It does not take a regex. LineEntityProcessor does, but your record elements probably do not each come on their own line, and you wouldn't want to depend on that anyway. I guess you would be best off writing your own entity processor - maybe by extending XPath EP if that gives you some advantage. You can of course also implement your own importer using SolrJ and your favourite XML parser framework - or any other programming language. If you are looking for a config-only solution - I'm not sure that there is one. Someone else might be able to comment on that?

Cheers,
Chantal

On Thu, 2011-07-28 at 19:17 +0200, solruser@9913 wrote:
Thanks Chantal

I am OK with the second call and I already tried using that. Unfortunately it reads the whole file into a field. My file is as below:

<xml>
  <record> ... </record>
  <record> ... </record>
  <record> ... </record>
</xml>

Now the XPath does the 'for each /record' part. For each record I also need to store the raw log in there. If I use the PlainTextEntityProcessor then it gives me the whole file (from <xml> to </xml>) and not each of the <record>...</record>. Am I using the PlainTextEntityProcessor wrong?

Thanks
g

--
View this message in context: http://lucene.472066.n3.nabble.com/Store-complete-XML-record-DIH-XPathEntityProcessor-tp3205524p3207203.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Matching queries on a per-element basis against a multivalued field
Hi Suk-Hyun Cho,

if myFriend is the unit of retrieval, you should use it as the Lucene document, with the fields isCool, gender, bloodType, ...

If you really want to insert all myFriends into one field, like in your example

myFriends = [ "isCool=true SOME_JUNK_HERE gender=female bloodType=O", "isCool=false SOME_JUNK_HERE gender=male bloodType=AB" ]

you can use SpanQueries: http://www.lucidimagination.com/blog/2009/07/18/the-spanquery/
With SpanNotQuery you can search for all "isCool true and gender male" where no other "isCool" is between both phrases.

Best regards
Karsten

P.S. see in context http://lucene.472066.n3.nabble.com/Matching-queries-on-a-per-element-basis-against-a-multivalued-field-td3217432.html
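A sketch of one way to do this with spans (Lucene 3.x API; the tokens and the boundary token are invented — it assumes whitespace-style tokens like "iscool=true" and a marker token indexed between the entries of the multivalued field):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanNotQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

SpanQuery cool = new SpanTermQuery(new Term("myFriends", "iscool=true"));
SpanQuery male = new SpanTermQuery(new Term("myFriends", "gender=male"));
// both tokens near each other, any order
SpanQuery near = new SpanNearQuery(new SpanQuery[] { cool, male }, 20, false);
// spans that contain the boundary token would cross two friends and are rejected
SpanQuery boundary = new SpanTermQuery(new Term("myFriends", "entry_break"));
Query q = new SpanNotQuery(near, boundary);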
Re: How to cut off hits with score below threshold?
Hi Otis,

is this the same question as http://lucene.472066.n3.nabble.com/Filter-by-relevance-td1837486.html ? If yes, perhaps something like (http://search-lucene.com/m/4AHNF17wIJW1/)

q={!frange l=0.85}query($qq)&qq=(the original relevancy query)

will help? (BTW, I also would like to be able to specify a custom Collector via API in Solr; possibly worth an issue?)

Best regards
Karsten

in context: http://lucene.472066.n3.nabble.com/How-to-cut-off-hits-with-score-below-threshold-td3219064.html

-------- Original Message --------
If one wanted to cut off hits whose score is below some threshold (I know, I know, one doesn't typically want to do this), what are the most elegant options?
Update some fields for all documents: LUCENE-1879 vs. ParallelReader .FilterIndex
Hi lucene/solr folks,

Issue: Our documents are stable except for two fields which are used for linking between the docs. So we would like to update these two fields in a batch once a month (possibly once a week). We cannot reindex all docs once a month, because we are using XeLDA in some fields for stemming (morphological analysis), and XeLDA is slow. We have 14 million docs (less than 100 GByte main index and 3 GByte for these two changeable fields). In the next half year we will be migrating our search engine from Verity K2 to Solr, so we could wait for Solr 4.0 (btw, any news about http://lucene.472066.n3.nabble.com/Release-schedule-Lucene-4-td2256958.html ?).

Solution? Our issue is exactly the purpose of ParallelReader. But Solr does not support ParallelReader (for a good reason: http://lucene.472066.n3.nabble.com/Vertical-Partitioning-advice-td494623.html#a494624 ). So I see two possible ways to solve our issue:

1. Wait for the new "Parallel incremental indexing" ( https://issues.apache.org/jira/browse/LUCENE-1879 ) and hope that Solr will integrate it.
Pro:
- nothing to do for us except waiting.
Contra:
- I did not find anything of the (old) patch in the current trunk.

2. Change the Lucene index below/without Solr in a batch:
a) Each month generate a new index with only our two changed fields (e.g. with DIH)
b) Use FilterIndex and ParallelReader to mock a correct index
c) “Merge” this mock index to a new index (via IndexWriter.addIndexes(IndexReader...))
Pro:
- The patch for https://issues.apache.org/jira/browse/LUCENE-1812 should be a good example of how to do this.
Contra:
- The relation between docId and document index order is not a guaranteed feature of DIH (e.g. we will have to split the main index to ensure that no merge will occur in/after DIH).
- To run this batch, Solr has to be stopped and restarted.
- Even if we know that our two fields should change only for a subset of the docs, we nevertheless have to reindex these two fields for all the docs.

Any comments, hints or tips? Is there a third (better) way to solve our issue? Is there already a working example of the 2nd solution? Will LUCENE-1879 (Parallel incremental indexing) be part of Solr 4.0?

Best regards
Karsten
Re: Solr Configuration with 404 error
Hi rocco,

you did not stop Jetty after your first attempt. (You have to kill that process.)

Best regards
Karsten

btw: How to change the port 8983: http://lucene.472066.n3.nabble.com/How-to-change-a-port-td490375.html

-------- Original Message --------
Date: Sun, 10 Jul 2011 20:11:54 -0700 (PDT)
From: rocco2004 steve.adams2...@gmail.com
To: solr-user@lucene.apache.org
Subject: Solr Configuration with 404 error

I installed Solr using: java -jar start.jar

However, I had downloaded the source code and didn't compile it (didn't pay attention). The error when using http://localhost:8983/solr/admin/ was:

HTTP ERROR: 404
Problem accessing /solr/admin/. Reason: NOT_FOUND

I realized that it was not working because the source code was not compiled. Then I downloaded the compiled version of Solr, but when trying to run the example configuration I'm getting the exception: java.net.BindException: Address already in use

Is there a way to revert the Solr configuration and start from scratch? It looks like the configuration got messed up. I don't see anything related to it in the manual. Here is the error:

2011-07-10 22:41:27.631:WARN::failed SocketConnector@0.0.0.0:8983: java.net.BindException: Address already in use
2011-07-10 22:41:27.632:WARN::failed Server@c4e21db: java.net.BindException: Address already in use
2011-07-10 22:41:27.632:WARN::EXCEPTION
java.net.BindException: Address already in use
  at java.net.PlainSocketImpl.socketBind(Native Method)
  at java.net.PlainSocketImpl.bind(PlainSocketImpl.java:383)
  at java.net.ServerSocket.bind(ServerSocket.java:328)
  at java.net.ServerSocket.<init>(ServerSocket.java:194)
  at java.net.ServerSocket.<init>(ServerSocket.java:150)
  at org.mortbay.jetty.bio.SocketConnector.newServerSocket(SocketConnector.java:80)
  at org.mortbay.jetty.bio.SocketConnector.open(SocketConnector.java:73)
  at org.mortbay.jetty.AbstractConnector.doStart(AbstractConnector.java:283)
  at org.mortbay.jetty.bio.SocketConnector.doStart(SocketConnector.java:147)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
  at org.mortbay.jetty.Server.doStart(Server.java:235)
  at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
  at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.mortbay.start.Main.invokeMain(Main.java:194)
  at org.mortbay.start.Main.start(Main.java:534)
  at org.mortbay.start.Main.start(Main.java:441)
  at org.mortbay.start.Main.main(Main.java:119)

Jul 10, 2011 10:41:27 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [] Registered new searcher Searcher@5b6b9e62 main

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Configuration-with-404-error-tp3157895p3157895.html
Sent from the Solr - User mailing list archive at Nabble.com.
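If the old Jetty process cannot be found and killed, the example server can also be started on a different free port; with the stock example jetty.xml something like java -Djetty.port=8984 -jar start.jar should work (the port number here is just an example).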
Re: Showing facet of first N docs
Hi Tommaso,

the FacetComponent works with the DocListAndSet#docSet. It should be easy to switch to DocListAndSet#docList, which contains only the documents of the result list (by default the top 10, but possibly docs 15-25 if start=15 and rows=11). That would mean changing the source code. Instead of changing the source code, the easier way should be to send a second request with a relevance filter (if your sort criterion is relevance): http://lucene.472066.n3.nabble.com/Filter-by-relevance-td1837486.html

Best regards
Karsten

http://lucene.472066.n3.nabble.com/Showing-facet-of-first-N-docs-td3071395.html

-------- Original Message --------
Date: Thu, 16 Jun 2011 12:39:32 +0200
From: Tommaso Teofili tommaso.teof...@gmail.com
To: solr-user@lucene.apache.org
Subject: Showing facet of first N docs

Hi all, do you know if it is possible to show the facets of a particular field for only the first N docs of the total number of results? It seems facet.limit doesn't help with it, as it defines a window on the facet constraints returned. Thanks in advance, Tommaso
Re: AndQueryNode to NearSpanQuery
Hi member of digitalsmiths,

I also implemented SpanNearQueryNode and some QueryNodeProcessors. Most probably you can solve your problem by using QueryNode#setTag: in QueryNodeProcessor#preProcessNode you can set, remove and reset a tag to mark the AndNodes that should become SpanNodes; after this you can use the QueryNodeProcessor#postProcessNode method to substitute these AndNodes in your OrNodes. (But be aware of https://issues.apache.org/jira/browse/LUCENE-3045 )

Best regards
Karsten

-------- Original Message --------
Date: Mon, 13 Jun 2011 19:45:49 -0700 (PDT)
From: mtraynham mtrayn...@digitalsmiths.com
To: solr-user@lucene.apache.org
Subject: AndQueryNode to NearSpanQuery

... The SpanNearQueryNode is a class I made that implements FieldableNode and extends QueryNodeImpl (as I want all Fieldable children to be from the same field, therefore just remembering the terms). Plus it maintains a distance or slop factor and an inOrder boolean. The problem here is that I can't keep the children from getting manipulated further down the pipeline, because I want my NearSpanQueryBuilder to use its original children nodes and at the same time be cloned/changed/etc. QueryNodeImpl has many private and final methods and you can't override setChildren, etc., etc., but I'd rather stay away from monkey patching.

--
View this message in context: http://lucene.472066.n3.nabble.com/AndQueryNode-to-NearSpanQuery-tp3061286p3061607.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Query on Synonyms feature in Solr
Hi rajini,

multi-word synonyms like "private schools" normally cause problems. See e.g. Solr 1.4 Enterprise Search Server, page 56: "For multi-word synonyms to work, the analysis must be applied at index-time and with expansion so that both the original words and the combined word get indexed." ...

Your problem: the input of the synonym filter must be the exact !token! "Private schools". But WhitespaceTokenizerFactory generates two tokens ("private" and "schools"), and for KeywordTokenizerFactory the whole text is one token.

Best regards
Karsten

-------- Original Message --------
Date: Mon, 13 Jun 2011 16:07:35 +0530
From: rajini maski rajinima...@gmail.com
To: solr-user@lucene.apache.org
Subject: Query on Synonyms feature in Solr

The synonyms feature is to be enabled on documents in Solr. I have one field in Solr that holds the content of a document (say field name: document_data). The data in that field is: "Tamil Nadu state private school fee determination committee headed by Justice Raviraja has submitted the private schools fees structure to the district educational officers on Monday"

Synonyms for "private school" in the synonyms flat file are: Private schools,NGO Schools,Unaided schools

Now when I search on this field as document_data=unaided schools, I need to get the results. Which tokenizer and analyzer filters can I apply to the document_data field in order to get the results above?

This is the indexed document:

<add>
  <doc>
    <field name="ID">SOLR200</field>
    <field name="document_data">Tamil Nadu state private school fee determination committee headed by Justice Raviraja has submitted the private schools fees structure to the district educational officers on Monday</field>
  </doc>
</add>

Right now I tried these 2 field types, and I couldn't get the above results:

<fieldType name="Synonym_document" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.SynonymFilter" synonyms="Taxonomy.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

<fieldType name="Synonym_document" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilter" synonyms="Taxonomy.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

<field name="document_data" type="Synonym_document" indexed="true" multiValued="true"/>

Both didn't work for my query. Can anyone please guide me on the tokenizer and analyzer filters I can apply to the document_data field in order to get the results above?

Regards,
Rajani
Re: RE: Indexing Question for large dataset
Hi Joshua,

what is the use case? Do you need only the facets for one field (per query)? Do you need all facet values, or only the first 10 in facet.sort=index (FACET_SORT_INDEX, i.e. numeric order) or facet.sort=count (FACET_SORT_COUNT)? How many different facet values do you have per field? Do you need these fields only for faceted search?

Your problem will be that Solr normally puts an int[searcher.maxDoc()] array in main memory for each field with facets. You can avoid this by using facet.method=enum, which probably does not fit your case. Because you do not have multiple tokens per document, your facets will be computed by SimpleFacets#getFieldCacheCounts. In version 3.1 you will find a TODO there that fits your needs :-( In this method you will also see that it indirectly uses a WeakHashMap, so if you only use 100 fields per hour you should not have a problem :-) But there will be no warm-up for your application (the first facet search will take a while).

From my point of view you should program your own Solr plugin for your purpose. This is not so hard, I assure you.

Best regards
Karsten

--- Joshua wrote:
"Name" equals the product name. Each separate product can have 1 to n prices based upon pricelist. A single document represents a single product:

<doc>
  <field name="id">1</field>
  <field name="name">The product name.</field>
  <field name="price">1.00</field>
  <field name="priceList1Price">0.99</field>
  <field name="priceList2Price">0.98</field>
  <field name="priceList1500Price">0.85</field>
</doc>
<doc>
  <field name="id">2</field>
  <field name="name">The product name.</field>
  <field name="price">1.10</field>
  <field name="priceList1Price">1.09</field>
  <field name="priceList2Price">1.08</field>
  <field name="priceList1500Price">1.05</field>
</doc>

Yes, the number of pricelists could grow from 1000 to 5000 as the user base grows. There are currently about 150,000 products. We do need to index the products, since they change frequently. Thanks everyone for all your responses so far!

-----Original Message-----
From: kenf_nc [mailto:ken.fos...@realestate.com]
Sent: Wednesday, April 13, 2011 1:15 PM
To: solr-user@lucene.apache.org
Subject: RE: Indexing Question for large dataset

Is NAME a product name? Why would it be multivalue? And why would it appear on more than one document? Is each 'document' a package of products? And the pricing tiers are on the package, not individual pieces? So it sounds like you could, potentially, have a PriceListX column for each user. As your user base grows, the number of columns you need may grow (you already bumped up from 2000 to 5000 in the space of a couple posts :) ). Is that right? How many products (or packages of products) do you have? Could you flip this on its ear and make a User the document? Then it could have just 3 multivalue fields (beyond any you need to identify the user, like user_id): product_id, product_name, product_price. The downside is that if a new product is introduced you have to re-index all users that have a price point on that product.

--
View this message in context: http://lucene.472066.n3.nabble.com/Indexing-Question-for-large-dataset-tp2816344p2816994.html
Sent from the Solr - User mailing list archive at Nabble.com.
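For completeness, the plain faceting request this use case implies, in SolrJ (URL assumed, field name taken from the example docs above); each distinct facet field costs one FieldCache array, the int[maxDoc] mentioned above, so only the fields actually used matter:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
SolrQuery q = new SolrQuery("*:*");
q.setFacet(true);
q.addFacetField("priceList1500Price"); // the one pricelist field relevant for this user
q.setFacetLimit(10);
q.setRows(0); // only the facet counts are needed
QueryResponse rsp = server.query(q);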
Re: DIH: Enhance XPathRecordReader to deal with //body(FLATTEN=true) and //body/h1
Hi Lance,

you are right: XPathEntityProcessor has the attribute xsl, so I can use XSLT to generate an XML file in the form of the standard Solr update schema. I will check the performance of this.

Best regards
Karsten

btw: flatten is an attribute of the field tag, not of XPathEntityProcessor (as wrongly specified in the wiki)

Lance:
There is an option somewhere to use the full XML DOM implementation for using xpaths. The purpose of the XPathEP is to be as simple and dumb as possible and handle most cases: RSS feeds and other open standards. Search for "xsl(optional)" at http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1

Karsten (on Sat, Apr 9, 2011 at 5:32 AM):
Hi Folks,

has anyone improved DIH XPathRecordReader to deal with nested xpaths?

e.g. data-config.xml with

<entity .. processor="XPathEntityProcessor" ..>
  <field column="title" xpath="//body/h1"/>
  <field column="alltext" xpath="//body" flatten="true"/>
</entity>

and an XML stream that contains /html/body/h1... will only fill the field "alltext"; the field "title" will be empty. This is a known issue from 2009: https://issues.apache.org/jira/browse/SOLR-1437#commentauthor_12756469_verbose

So three questions:
1. How to fill a "search over all" field without nested xpaths? (schema.xml <copyField source="*" dest="alltext"/> will not help, because we lose the original token order)
2. Has anyone tried to improve XPathRecordReader to deal with nested xpaths?
3. Does anyone else need this feature?

Best regards
Karsten

http://lucene.472066.n3.nabble.com/DIH-Enhance-XPathRecordReader-to-deal-with-body-FLATTEN-true-and-body-h1-td2799005.html
Re: DIH: Enhance XPathRecordReader to deal with //body(FLATTEN=true) and //body/h1
Hi Lance,

I used XPathEntityProcessor with the attribute xsl and generated an XML file in the form of the standard Solr update schema. I lost a lot of performance; it is a pity that XPathEntityProcessor uses only one thread.

My tests with a collection of 350,000 documents:
1. use of XPathRecordReader without XSLT: 28 min
2. use of XPathEntityProcessor with XSLT (standard solr.war / Xalan): 44 min
3. use of XPathEntityProcessor with Saxon XSLT: 36 min

Best regards
Karsten

Lance:
There is an option somewhere to use the full XML DOM implementation for using xpaths. The purpose of the XPathEP is to be as simple and dumb as possible and handle most cases: RSS feeds and other open standards. Search for "xsl(optional)" at http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1

--karsten:
Hi Folks,

has anyone improved DIH XPathRecordReader to deal with nested xpaths?

e.g. data-config.xml with

<entity .. processor="XPathEntityProcessor" ..>
  <field column="title" xpath="//body/h1"/>
  <field column="alltext" xpath="//body" flatten="true"/>
</entity>

and an XML stream that contains /html/body/h1... will only fill the field "alltext"; the field "title" will be empty. This is a known issue from 2009: https://issues.apache.org/jira/browse/SOLR-1437#commentauthor_12756469_verbose

So three questions:
1. How to fill a "search over all" field without nested xpaths? (schema.xml <copyField source="*" dest="alltext"/> will not help, because we lose the original token order)
2. Has anyone tried to improve XPathRecordReader to deal with nested xpaths?
3. Does anyone else need this feature?

Best regards
Karsten

http://lucene.472066.n3.nabble.com/DIH-Enhance-XPathRecordReader-to-deal-with-body-FLATTEN-true-and-body-h1-td2799005.html
DIH: Enhance XPathRecordReader to deal with //body(FLATTEN=true) and //body/h1
Hi Folks,

has anyone improved DIH XPathRecordReader to deal with nested xpaths?

e.g. data-config.xml with

<entity .. processor="XPathEntityProcessor" ..>
  <field column="title" xpath="//body/h1"/>
  <field column="alltext" xpath="//body" flatten="true"/>
</entity>

and an XML stream that contains /html/body/h1... will only fill the field "alltext"; the field "title" will be empty. This is a known issue from 2009: https://issues.apache.org/jira/browse/SOLR-1437#commentauthor_12756469_verbose

So three questions:
1. How to fill a "search over all" field without nested xpaths? (schema.xml <copyField source="*" dest="alltext"/> will not help, because we lose the original token order)
2. Has anyone tried to improve XPathRecordReader to deal with nested xpaths?
3. Does anyone else need this feature?

Best regards
Karsten
Solr without Server / Search solutions with Solr on DVD (examples?)
Hi folks,

we want to migrate our search portal to Solr. But some of our customers search our information offline with a DVD version. So we want to estimate the complexity of a Solr DVD version. This means trimming Solr to work on small computers, the opposite of heavy load: no server optimizations, no cache, fewer facet terms in memory...

My questions: Does anyone know examples of solutions with Solr starting from DVD? Is there a tutorial for configuring a slow Solr on a computer with little main memory? Any best-practice tips of your own?

Best regards
Karsten
Re: Solr without Server / Search solutions with Solr on DVD (examples?)
Hi Ezequiel,

in Solr the performance of sorting and faceted search is mainly a question of main memory. E.g. Mike McCandless wrote in s.apache.org/OWK that sorting 5M wikipedia documents by the title field needs 674 MB of RAM.

But again: my main interest is an example of other companies/products that delivered information on DVD with standalone Solr.

Best regards
Karsten

--- Ezequiel:
Try setting up a virtual machine and see its performance. I'm really not a Java guy, so I really don't know how to tune it for performance... But afaik Solr handles pretty well in RAM if the index is static...

On Thu, Apr 7, 2011 at 2:48 PM, Karsten Fissmer karsten-s...@gmx.de wrote:
Hi yonik, Hi Ezequiel,

Java is no problem for a DVD version. We already have a DVD version with a servlet container (but it currently does not use Solr). Some of our customers work in public-sector institutions and have less than 1 GB main memory, but they use MS Word and IE and... But let us say that we can set -Xmx384m (we have 14 million documents). -Xmx384m with 14 million units of retrieval means e.g. that we cannot allow the same fields for sorting as on the server. My main interest is an example of other companies/products that delivered information on DVD with standalone Solr.

Best regards
Karsten

---yonik:
Including a JRE on the DVD and a launch script that uses that JRE by default should be doable as well. -Yonik

---Jeffrey:
Even if you can ship your DVD with a Jetty server, you'll still need Java installed on the customer machine...

---Karsten:
My question: Does anyone know examples of solutions with Solr starting from DVD? Is there a tutorial for configuring a slow Solr on a computer with little main memory? Any best-practice tips of your own?

--
__
Ezequiel.
Http://www.ironicnet.com
sending a parsed query to solr (xml-query-parser, syntaxtree)
Hi,

I am working on a migration from Verity K2 to Solr. At this point I have a parser for the Verity Query Language (the subset we use) which generates a syntax tree. I translate this tree into a couple of filters and one query. This fragmentation is the reason why I cannot use my parser inside Solr (via QParserPlugin: http://wiki.apache.org/solr/SolrPlugins#QParserPlugin ). Because I have a syntax tree, I would like to use the QueryParser Lucene contrib module. Another reason is that we need our own PhraseQuery, so the normal Solr query syntax will not work (not even the nested queries: http://www.lucidimagination.com/blog/2009/03/31/nested-queries-in-solr/ http://www.lucidimagination.com/blog/2009/02/22/exploring-query-parsers/ )

So let's say that I already have an org.apache.lucene.queryParser.core.nodes.QueryNode. What is the proper way to send this to Solr?
1. Serialize to XML and use xml-query-parser for deserialization (but support in Solr is not stable: https://issues.apache.org/jira/browse/SOLR-839 )
2. Serialize and deserialize with XStream
3. Serialize and deserialize with NamedList (like SolrJ does in the other direction)
4. Other suggestions?

If 1: Does anyone use xml-query-parser with heavy loads?
If 2: Does anyone know why QueryNodeImpl.java lost its serialVersionUID from 3.x to 4.x? Will it no longer implement Serializable?
If 3: Has anyone used NamedList to send information to Solr? Has anyone used NamedList to represent a QueryNode (or syntax tree)?

Best regards
Karsten

P.S. A small example of why QueryParser is great: http://sujitpal.blogspot.com/2011/03/using-lucenes-new-queryparser-framework.html
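For option 1, a minimal sketch of the server-side deserialization with the xml-query-parser contrib (Lucene 3.x package name; the default field and analyzer are assumptions):

import java.io.ByteArrayInputStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;
import org.apache.lucene.xmlparser.CoreParser;

// parse a serialized query tree instead of re-parsing query text
CoreParser parser = new CoreParser("alltext", new StandardAnalyzer(Version.LUCENE_34));
String xml = "<TermQuery fieldName=\"alltext\">verity</TermQuery>";
Query q = parser.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
// q could then be executed against the SolrIndexSearcher inside a custom QParserPlugin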