Re: Features not present in Solr
I use Endeca and Solr. A few notable things in Endeca but not in Solr:

1. Real-time search.
2. "Related record navigation" (RRN) is what they call it. This is the ability to join in other records, something Lucene/Solr definitely can't do.
3. A reference application for browsing/searching the data.
4. Data pipeline management software including a GUI tool to wire in different paths. I'm not a fan of this because the implementation sucks.
5. Hierarchical facets, including sifts (e.g. A-E, F-M, etc.) and attaching user meta-data to nodes (such as an id you need or something).
6. XQuery based ad-hoc querying with XML output.
7. Aggregating (e.g. rolling-up) records.

IMO, the really notable things to appreciate are #1, #2, and #3, though admittedly I'm not using #1 or #2. I would consider them if money is not a problem and you really need #1 or #2.

Endeca's bloat and product age is a problem. You have to run a number of installers, you have over a dozen PDFs and other help documents... I'm sometimes wondering where the heck I read something and what installer installed what. It's like comparing Oracle with perhaps PostgreSQL. And it's really annoying to have to deal with Endeca "dimension ids" (numbers) instead of Solr facet string literals because I find myself having to map them all the time. The native Java API sucks. I could complain a lot more (I've stopped myself multiple times while writing this) but this post would get out of control.

It _is_ a capable product, but I'll take Solr over it any day -- at least I understand basically all of what's going on in Solr. Of course I wrote the book on it so I'm biased ;-)

~ David Smiley
Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book

Srikanth B wrote:
> Hello
>
> We are in the process of researching on Solr features. I am looking for two things
> 1. Features not available in Solr but present in other products like Endeca
> 2. What one shouldn't not expect from Solr
>
> Any thoughts ?
>
> Thanks in advance
> Srikanth

--
View this message in context: http://old.nabble.com/Features-not-present-in-Solr-tp27966315p27996518.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr-ruby with clustering
False alarm. On the client side I was explicitly setting a shard, which made my query/solr-ruby/Solr treat it as a distributed request, and distributed requests aren't supported by the clustering component.

cheers,
mike

On Mon, Mar 22, 2010 at 8:53 PM, mike anderson wrote:
> Has anybody got solr-ruby to return a clustering result? (using the
> clustering component)
>
> I'm almost certain the query is correct (I check the solr logs for the
> query and run it in my browser, get back the cluster output as
> expected). But when I dump the response from my solr-ruby query the
> clustering output is nowhere to be found. I noticed that the
> clustering output has a data type of "Arr", where the response and
> other components have output of type "Lst", could this be the problem?
>
> If anyone can think of some other debugging I could try I'd love to hear it.
>
> Thanks in advance,
> Mike
solr-ruby with clustering
Has anybody got solr-ruby to return a clustering result? (using the clustering component) I'm almost certain the query is correct (I check the solr logs for the query and run it in my browser, get back the cluster output as expected). But when I dump the response from my solr-ruby query the clustering output is nowhere to be found. I noticed that the clustering output has a data type of "Arr", where the response and other components have output of type "Lst", could this be the problem? If anyone can think of some other debugging I could try I'd love to hear it. Thanks in advance, Mike
Re: synonyms problem
How large is the document, and how often does 'aberrant' appear in it? Are the other words also in the document? What is the full analysis stack? There might be interactions between the SynonymFilter and other filters. What does the admin/analysis.jsp page show? Does it throw OutOfMemory also? Does stemming turn two of the terms into the same term? On Mon, Mar 22, 2010 at 7:48 AM, Armando Ota wrote: > Have you tried increasing memory size ? > > we had some out of memory problems when we used default memory size .. > > Kind regards > > Armando > > michaelnazaruk wrote: >> >> Hi all! I have a little problem with synonyms: >> when I set my synonyms.txt file such as: >> >> aberrant=>abnormal,unusual,deviant,anomalous,peculiar,uncharacteristic,irregular,atypical >> it's all right! But if I set this file such as >> >> aberrant,abnormal,unusual,deviant,anomalous,peculiar,uncharacteristic,irregular,atypical >> I get exception that not enough memory >> >> > -- Lance Norskog goks...@gmail.com
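To make the difference between the two synonyms.txt forms concrete, here is a rough Python sketch (not Solr code) that counts how many term-to-term mappings each line implies. It assumes the synonym filter is configured with expand="true", where a comma-separated line maps every listed term to every term in the list, while an explicit "=>" line maps only the left-hand side. The function name expand_synonym_line is made up for this illustration, and the count alone does not explain the OutOfMemory error; it only shows that the second form produces far more mappings.

```python
def expand_synonym_line(line, expand=True):
    """Return {input_term: [output_terms]} for one synonyms.txt-style line.

    Hypothetical helper for illustration; it mimics the documented
    semantics of the two line formats, not Solr's implementation.
    """
    if "=>" in line:
        # Explicit mapping: only the left-hand terms are rewritten.
        left, right = line.split("=>", 1)
        outputs = [t.strip() for t in right.split(",") if t.strip()]
        return {t.strip(): outputs for t in left.split(",") if t.strip()}
    terms = [t.strip() for t in line.split(",") if t.strip()]
    if expand:
        # Equivalent synonyms: every term expands to all terms in the list.
        return {t: list(terms) for t in terms}
    # Without expansion, every term is reduced to the first one.
    return {t: [terms[0]] for t in terms}


explicit = expand_synonym_line(
    "aberrant=>abnormal,unusual,deviant,anomalous,peculiar,"
    "uncharacteristic,irregular,atypical"
)
equivalent = expand_synonym_line(
    "aberrant,abnormal,unusual,deviant,anomalous,peculiar,"
    "uncharacteristic,irregular,atypical"
)

explicit_pairs = sum(len(v) for v in explicit.values())      # 8
equivalent_pairs = sum(len(v) for v in equivalent.values())  # 81
print(explicit_pairs, equivalent_pairs)
```

With the explicit form only "aberrant" is rewritten (8 outputs); with the comma-separated form each of the 9 terms expands to all 9.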
Re: Features not present in Solr
Hmm... sounds pretty much like what this book should be about (once finished): http://www.manning.com/ingersoll/ On Mon, Mar 22, 2010 at 8:46 PM, Lance Norskog wrote: > About Text Analysis: "Natural Language Processing" is the more usual > term. Finding parts of speech, isolating people's names, etc. > > On Mon, Mar 22, 2010 at 12:27 PM, Israel Ekpo > wrote: > > On Mon, Mar 22, 2010 at 3:16 PM, Lance Norskog > wrote: > > > >> Web crawling. > > > > > > I don't think Solr was designed with Web Crawling in mind. Nutch would be > > more better suited for that, I believe. > > > > > >> Text analysis. > >> > > > > This is a bit vague. > > > > Please elaborate further. There is a lot of analysis (stemming, stop-word > > removal, character transformation etc) that takes place already though > > implicitly based on what fields you define and use in the schema. > > > > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters > > > > > >> Distributed index management. > >> A fanatical devotion to the Pope. > >> > >> There a probably a lot of features already available in Solr out of the > box > > that most of those other "enterprise level" applications do not have yet. > > > > You would also be surprised to learn that a lot of them use Lucene under > the > > covers and are actually trying to re-implement what is already available > in > > Solr. > > > > > >> On Sun, Mar 21, 2010 at 11:19 PM, MitchK wrote: > >> > > >> > Srikanth, > >> > > >> > I don't know anything about Endeca, so I can't compare Solr to it. > >> > However, I know Solr is powerful. Very powerful. > >> > So, maybe you should tell us more about your needs to get a good > answer. > >> > > >> > As a response to your second question: You should not expect that Solr > is > >> > a database. It is an index-server. A database makes your data save. If > >> there > >> > goes something wrong - which is always possible - Solr gives no > >> warranties. > >> > Maybe someone other can tell you more about this topic. 
> >> > > >> > - Mitch > >> > > >> > > >> > Srikanth B wrote: > >> >> > >> >> Hello > >> >> > >> >> We are in the process of researching on Solr features. I am looking > for > >> >> two > >> >> things > >> >> 1. Features not available in Solr but present in other > products > >> >> like > >> >> Endeca > >> >> 2. What one shouldn't not expect from Solr > >> >> > >> >> Any thoughts ? > >> >> > >> >> Thanks in advance > >> >> Srikanth > >> >> > >> >> > >> > > >> > -- > >> > View this message in context: > >> > http://old.nabble.com/Features-not-present-in-Solr-tp27966315p27982734.html > >> > Sent from the Solr - User mailing list archive at Nabble.com. > >> > > >> > > >> > >> > >> > >> -- > >> Lance Norskog > >> goks...@gmail.com > >> > > > > > > > > -- > > "Good Enough" is not good enough. > > To give anything less than your best is to sacrifice the gift. > > Quality First. Measure Twice. Cut Once. > > http://www.israelekpo.com/ > > > > > > -- > Lance Norskog > goks...@gmail.com >
Re: Features not present in Solr
About Text Analysis: "Natural Language Processing" is the more usual term. Finding parts of speech, isolating people's names, etc. On Mon, Mar 22, 2010 at 12:27 PM, Israel Ekpo wrote: > On Mon, Mar 22, 2010 at 3:16 PM, Lance Norskog wrote: > >> Web crawling. > > > I don't think Solr was designed with Web Crawling in mind. Nutch would be > more better suited for that, I believe. > > >> Text analysis. >> > > This is a bit vague. > > Please elaborate further. There is a lot of analysis (stemming, stop-word > removal, character transformation etc) that takes place already though > implicitly based on what fields you define and use in the schema. > > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters > > >> Distributed index management. >> A fanatical devotion to the Pope. >> >> There a probably a lot of features already available in Solr out of the box > that most of those other "enterprise level" applications do not have yet. > > You would also be surprised to learn that a lot of them use Lucene under the > covers and are actually trying to re-implement what is already available in > Solr. > > >> On Sun, Mar 21, 2010 at 11:19 PM, MitchK wrote: >> > >> > Srikanth, >> > >> > I don't know anything about Endeca, so I can't compare Solr to it. >> > However, I know Solr is powerful. Very powerful. >> > So, maybe you should tell us more about your needs to get a good answer. >> > >> > As a response to your second question: You should not expect that Solr is >> > a database. It is an index-server. A database makes your data save. If >> there >> > goes something wrong - which is always possible - Solr gives no >> warranties. >> > Maybe someone other can tell you more about this topic. >> > >> > - Mitch >> > >> > >> > Srikanth B wrote: >> >> >> >> Hello >> >> >> >> We are in the process of researching on Solr features. I am looking for >> >> two >> >> things >> >> 1. Features not available in Solr but present in other products >> >> like >> >> Endeca >> >> 2. 
What one shouldn't not expect from Solr >> >> >> >> Any thoughts ? >> >> >> >> Thanks in advance >> >> Srikanth >> >> >> >> >> > >> > -- >> > View this message in context: >> http://old.nabble.com/Features-not-present-in-Solr-tp27966315p27982734.html >> > Sent from the Solr - User mailing list archive at Nabble.com. >> > >> > >> >> >> >> -- >> Lance Norskog >> goks...@gmail.com >> > > > > -- > "Good Enough" is not good enough. > To give anything less than your best is to sacrifice the gift. > Quality First. Measure Twice. Cut Once. > http://www.israelekpo.com/ > -- Lance Norskog goks...@gmail.com
Re: Query interface
There are several response formats available for Solr: http://wiki.apache.org/solr/QueryResponseWriter Also, XSLT scripts and Velocity scripts are available for pre-processing output formats. On Mon, Mar 22, 2010 at 9:00 AM, Armando Ota wrote: > Hey ... > > Thank you very much .. been strugling with this for hours now :( > > Will have to change the feature .. somehow :D > > Kind regards > > Armando > > Abdelhamid ABID wrote: >> >> Hi, >> I think there isn't better than using XSLT as a mean to query solr and >> render results. >> Within an xslt file you would combine search form with search results in >> one >> place, by this way you free the server from the heavy duty tasks of xslt >> transformation and let the client -which is in the most cases a browser- >> do >> the work. >> >> On 3/22/10, Gora Mohanty wrote: >> >>> >>> On Mon, 22 Mar 2010 15:26:41 +0100 >>> Sebastian Funk wrote: >>> >>> hey there, i've been using solr for some time now and set everything up the way it's supposed to.. now for the user interface: simply writing a javascript (or something else) website that passes the query-URL to solr and interprets the XML given as a result. is that the easiest way? i've noticed some problems with umlauts etc.. when using jetty or tomcat as a server.. is there another way to query solr and retrieve the results? >>> >>> [...] >>> >>> Many modern frameworks (I certainly know of Ruby on Rails, and >>> Django), have Solr integrated via an application. I really like >>> Django Haystack for how it offers an easy way to get started with >>> various search back-ends, with a very Django-ish feel to the >>> interface: http://haystacksearch.org/ >>> >>> Regards, >>> >>> Gora >>> >>> >> >> >> >> > -- Lance Norskog goks...@gmail.com
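On the umlaut problem mentioned in this thread: one common cause is a query string that is not percent-encoded as UTF-8 before it reaches the servlet container. Below is a minimal Python sketch of building a select URL that way and asking for the JSON response writer (wt=json) instead of XML. The base URL and the field name "name" are assumptions for the example, not anything taken from the original poster's setup.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, rows=10):
    """Build a Solr /select URL with the query percent-encoded as UTF-8."""
    params = {"q": query, "wt": "json", "rows": rows}
    return base_url + "/select?" + urlencode(params, encoding="utf-8")


# Hypothetical host and field; the umlaut becomes %C3%BC in the URL.
url = build_select_url("http://localhost:8983/solr", "name:Müller")
print(url)
```

If the encoded URL still returns garbled characters, the container's own URI decoding setting is the next thing worth checking.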
Re: DIH - Categories not indexed ????
Whoops, yes it is in the wiki. A link from the admin page would be welcome. On Mon, Mar 22, 2010 at 12:37 PM, Lance Norskog wrote: > There is a very cool debugger for the DataImportHandler: > > http://www.lucidimagination.com/search/document/CDRG_ch06_6.4.9?q=dataimport > debug jsp > > It is not mentioned on the wiki, nor are there any links to it in the > Solr admin console. > > On Mon, Mar 22, 2010 at 8:36 AM, stocki wrote: >> >> Helloo. >> >> i have the same database like in this example: >> http://wiki.apache.org/solr/DataImportHandler?highlight=(dih)#Full_Import_Example >> >> this is my data-config.xml >> >> >> > query="select id, shop_id, is_active, order_index, >> shop_item_number, manufacturer, name, ean, isbn, modified from shop_items"> >> >> >> >> >> >> >> >> >> >> > dateTimeFormat="-MM-'hh:mm:ss'Z'" /> >> >> >> > name="shop_category_id" /> >> >> >> >> >> >> >> >> >> >> >> i have absolute no idea why solr didnt index the category name and >> category_id... >> >> one product can have more than one values. >> >> please help meee someone .. ^^ ;) >> >> -- >> View this message in context: >> http://old.nabble.com/DIH---Categories-not-indexed--tp27988126p27988126.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >> > > > > -- > Lance Norskog > goks...@gmail.com > -- Lance Norskog goks...@gmail.com
Re: DIH - Categories not indexed ????
There is a very cool debugger for the DataImportHandler: http://www.lucidimagination.com/search/document/CDRG_ch06_6.4.9?q=dataimport debug jsp It is not mentioned on the wiki, nor are there any links to it in the Solr admin console. On Mon, Mar 22, 2010 at 8:36 AM, stocki wrote: > > Helloo. > > i have the same database like in this example: > http://wiki.apache.org/solr/DataImportHandler?highlight=(dih)#Full_Import_Example > > this is my data-config.xml > > > query="select id, shop_id, is_active, order_index, > shop_item_number, manufacturer, name, ean, isbn, modified from shop_items"> > > > > > > > > > > dateTimeFormat="-MM-'hh:mm:ss'Z'" /> > > > name="shop_category_id" /> > > > > > > > > > > > i have absolute no idea why solr didnt index the category name and > category_id... > > one product can have more than one values. > > please help meee someone .. ^^ ;) > > -- > View this message in context: > http://old.nabble.com/DIH---Categories-not-indexed--tp27988126p27988126.html > Sent from the Solr - User mailing list archive at Nabble.com. > > -- Lance Norskog goks...@gmail.com
Re: Features not present in Solr
On Mon, Mar 22, 2010 at 3:16 PM, Lance Norskog wrote: > Web crawling. I don't think Solr was designed with Web Crawling in mind. Nutch would be more better suited for that, I believe. > Text analysis. > This is a bit vague. Please elaborate further. There is a lot of analysis (stemming, stop-word removal, character transformation etc) that takes place already though implicitly based on what fields you define and use in the schema. http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters > Distributed index management. > A fanatical devotion to the Pope. > > There a probably a lot of features already available in Solr out of the box that most of those other "enterprise level" applications do not have yet. You would also be surprised to learn that a lot of them use Lucene under the covers and are actually trying to re-implement what is already available in Solr. > On Sun, Mar 21, 2010 at 11:19 PM, MitchK wrote: > > > > Srikanth, > > > > I don't know anything about Endeca, so I can't compare Solr to it. > > However, I know Solr is powerful. Very powerful. > > So, maybe you should tell us more about your needs to get a good answer. > > > > As a response to your second question: You should not expect that Solr is > > a database. It is an index-server. A database makes your data save. If > there > > goes something wrong - which is always possible - Solr gives no > warranties. > > Maybe someone other can tell you more about this topic. > > > > - Mitch > > > > > > Srikanth B wrote: > >> > >> Hello > >> > >> We are in the process of researching on Solr features. I am looking for > >> two > >> things > >> 1. Features not available in Solr but present in other products > >> like > >> Endeca > >> 2. What one shouldn't not expect from Solr > >> > >> Any thoughts ? 
> >> > >> Thanks in advance > >> Srikanth > >> > >> > > > > -- > > View this message in context: > http://old.nabble.com/Features-not-present-in-Solr-tp27966315p27982734.html > > Sent from the Solr - User mailing list archive at Nabble.com. > > > > > > > > -- > Lance Norskog > goks...@gmail.com > -- "Good Enough" is not good enough. To give anything less than your best is to sacrifice the gift. Quality First. Measure Twice. Cut Once. http://www.israelekpo.com/
Re: Features not present in Solr
On Mon, Mar 22, 2010 at 8:16 PM, Lance Norskog wrote:

> Web crawling.

Nutch, Lucene Connectors Framework... would it help to include this directly into Solr code base?

> Text analysis.

Under development I think, see Mahout (check some proposed GSoC tickets in JIRA)

> Distributed index management.
> A fanatical devotion to the Pope.
>
> On Sun, Mar 21, 2010 at 11:19 PM, MitchK wrote:
> >
> > Srikanth,
> >
> > I don't know anything about Endeca, so I can't compare Solr to it.
> > However, I know Solr is powerful. Very powerful.
> > So, maybe you should tell us more about your needs to get a good answer.
> >
> > As a response to your second question: You should not expect that Solr is
> > a database. It is an index-server. A database makes your data save. If there
> > goes something wrong - which is always possible - Solr gives no warranties.
> > Maybe someone other can tell you more about this topic.
> >
> > - Mitch
> >
> > Srikanth B wrote:
> >>
> >> Hello
> >>
> >> We are in the process of researching on Solr features. I am looking for two things
> >> 1. Features not available in Solr but present in other products like Endeca
> >> 2. What one shouldn't not expect from Solr
> >>
> >> Any thoughts ?
> >>
> >> Thanks in advance
> >> Srikanth
> >
> > --
> > View this message in context: http://old.nabble.com/Features-not-present-in-Solr-tp27966315p27982734.html
>
> --
> Lance Norskog
> goks...@gmail.com
Re: Question about query
One thing I've seen suggested is to add the number of values to a separate field, say topic_count. Then, in your situation above, you could append "AND topic_count:1" to the clause that matches topic 1. This extends to any exact set of values (and only that set). For instance, topic:5 AND topic:10 AND topic:20 AND topic_count:3 would give you article 4.

I don't know if this works in your particular situation.

Erick

On Mon, Mar 22, 2010 at 10:32 AM, Armando Ota wrote:
> Hi
>
> I need a little help with query for my problem (if it can be solved)
>
> I have a field in a document called topic
>
> this field contains some values, 0 (for no topic) or 1 (topic 1), 2, 3, etc ...
>
> It can contain many values like 1, 10, 50, etc (for 1 doc)
>
> So now to the problem:
> I would like to get documents that have 0 for topic value and documents
> that only have for example 1 for topic value inserted
>
> articles for example:
> article 1 topics: 1, 5, 10, 20, 24
> article 2 topics: 0
> article 3 topics: 1
> article 4 topic: 5, 10, 20
> article 5 topic: 1, 13, 19
>
> So I need search query to return me only article 2 and 3 not other articles
> with 1 for topic value
>
> Can that be done ? Any help appreciated
>
> Kind regards
>
> Armando
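The topic_count idea can be checked against Armando's five example articles with a small in-memory simulation. This is a Python sketch only, not Solr code: the helper names are made up, and it assumes each document stores its topic values without duplicates so that topic_count equals the number of distinct topics.

```python
# Armando's example articles: id -> topic values.
articles = {
    1: [1, 5, 10, 20, 24],
    2: [0],
    3: [1],
    4: [5, 10, 20],
    5: [1, 13, 19],
}


def index_with_count(topics):
    """Simulate indexing: keep the topics and add a topic_count field."""
    return {"topic": set(topics), "topic_count": len(topics)}


def matches_exactly(doc, wanted):
    """Equivalent of: topic:a AND topic:b ... AND topic_count:<len(wanted)>."""
    return set(wanted) <= doc["topic"] and doc["topic_count"] == len(wanted)


index = {doc_id: index_with_count(t) for doc_id, t in articles.items()}

# Roughly: topic:0 OR (topic:1 AND topic_count:1)
only_zero_or_only_one = sorted(
    doc_id
    for doc_id, doc in index.items()
    if matches_exactly(doc, [0]) or matches_exactly(doc, [1])
)

# Roughly: topic:5 AND topic:10 AND topic:20 AND topic_count:3
exactly_5_10_20 = sorted(
    doc_id for doc_id, doc in index.items() if matches_exactly(doc, [5, 10, 20])
)

print(only_zero_or_only_one)  # [2, 3]
print(exactly_5_10_20)        # [4]
```

The count has to be computed by whatever feeds documents into the index, since it is just another field as far as the query is concerned.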
Re: Features not present in Solr
Web crawling. Text analysis. Distributed index management. A fanatical devotion to the Pope. On Sun, Mar 21, 2010 at 11:19 PM, MitchK wrote: > > Srikanth, > > I don't know anything about Endeca, so I can't compare Solr to it. > However, I know Solr is powerful. Very powerful. > So, maybe you should tell us more about your needs to get a good answer. > > As a response to your second question: You should not expect that Solr is > a database. It is an index-server. A database makes your data save. If there > goes something wrong - which is always possible - Solr gives no warranties. > Maybe someone other can tell you more about this topic. > > - Mitch > > > Srikanth B wrote: >> >> Hello >> >> We are in the process of researching on Solr features. I am looking for >> two >> things >> 1. Features not available in Solr but present in other products >> like >> Endeca >> 2. What one shouldn't not expect from Solr >> >> Any thoughts ? >> >> Thanks in advance >> Srikanth >> >> > > -- > View this message in context: > http://old.nabble.com/Features-not-present-in-Solr-tp27966315p27982734.html > Sent from the Solr - User mailing list archive at Nabble.com. > > -- Lance Norskog goks...@gmail.com
Re: SOLR-1316 How To Implement this autosuggest component ???
I patched a nightly build of Solr. The patch applies and the classes are in the correct folder, but when I replace the spellcheck component with this spellchecker, as shown in the comments, Solr cannot find the classes =(

suggest org.apache.solr.spelling.suggest.Suggester org.apache.solr.spelling.suggest.jaspell.JaspellLookup text american-english -->

SCHWERWIEGEND: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.spelling.suggest.Suggester'

Why is that? I don't think anyone else has as much trouble applying a patch as I do =( :D

Andrzej Bialecki wrote:
>
> On 2010-03-19 13:03, stocki wrote:
>>
>> hello..
>>
>> i try to implement autosuggest component from these link:
>> http://issues.apache.org/jira/browse/SOLR-1316
>>
>> but i have no idea how to do this !?? can anyone get me some tipps ?
>
> Please follow the instructions outlined in the JIRA issue, in the
> comment that shows fragments of XML config files.
>
> --
> Best regards,
> Andrzej Bialecki
> Information Retrieval, Semantic Web
> Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com

--
View this message in context: http://old.nabble.com/SOLR-1316-How-To-Implement-this-autosuggest-component-tp27950949p27990809.html
Re: Correct way to use tokenizer for whitespace
> Thank you. I tried that but it did > not work to remove trailing spaces. > I believe this is why my size facet queries are not > working. After > reloading, the XML result entries still have: > > > LARGE > MEDIUM > SMALL > > > I am using this: > > > class="solr.StandardTokenizerFactory"/> > > > > And here is my size field: > indexed="true" stored="true" > multiValued="true" required="false"/> The problem is you are using string type (type="string") here. Which is not analyzed. It should be :
Re: Correct way to use tokenizer for whitespace
Thank you. I tried that but it did not work to remove trailing spaces. I believe this is why my size facet queries are not working. After reloading, the XML result entries still have: LARGE MEDIUM SMALL I am using this: And here is my size field: I did not know what difference this does: vs this: But it appears I do not need that part. On Mon, Mar 22, 2010 at 2:12 PM, Ahmet Arslan wrote: > >> In my schema.xml, I am trying to remove whitespace from a >> multivalued >> field as they come from the database. Is this the correct >> way: >> >> > class="solr.TextField"> >> >> > class="solr.StandardTokenizerFactory"/> >> > class="solr.TrimFilterFactory" /> >> >> >> >> I do not believe this is working. > > TrimFilterFactory trims leading and trailing white-spaces. But > StandardTokenizerFactory already eats up white-spaces. In other words it is > meaningless to use it with StandardTokenizerFactory. > > In your field type definition you specified only query analyzer but not index > analyzer. You can use this directly: > > > > > > > > What do you mean by removing whitespace from a multivalued field as they come > from the database? > > > >
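The trailing-space facet problem in this thread can be illustrated outside Solr. The Python sketch below assumes, as the replies suggest, that a field of type "string" is indexed verbatim with no analysis, so "LARGE " and "LARGE" end up as different facet values, whereas a field type that keeps the whole value as one token and trims it yields one bucket per size. The sample values and the facet_counts helper are invented for the example.

```python
from collections import Counter

# Hypothetical values as they might arrive from the database.
raw_sizes = ["LARGE ", "LARGE", "MEDIUM ", "SMALL  ", "SMALL"]


def facet_counts(values, analyze=None):
    """Count facet buckets, optionally applying an index-time analysis step."""
    if analyze is not None:
        values = [analyze(v) for v in values]
    return dict(Counter(values))


# Un-analyzed field: whitespace variants are separate buckets.
string_field = facet_counts(raw_sizes)

# Whole value kept as a single token, then trimmed.
trimmed_field = facet_counts(raw_sizes, analyze=str.strip)

print(string_field)   # five buckets, one per distinct raw value
print(trimmed_field)  # {'LARGE': 2, 'MEDIUM': 1, 'SMALL': 2}
```

Note that analysis only affects the indexed terms used for faceting and matching; the stored value returned in results is whatever was sent in, so trimming in the import step may still be worthwhile.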
Re: Correct way to use tokenizer for whitespace
> In my schema.xml, I am trying to remove whitespace from a > multivalued > field as they come from the database. Is this the correct > way: > > class="solr.TextField"> > > class="solr.StandardTokenizerFactory"/> > class="solr.TrimFilterFactory" /> > > > > I do not believe this is working. TrimFilterFactory trims leading and trailing white-spaces. But StandardTokenizerFactory already eats up white-spaces. In other words it is meaningless to use it with StandardTokenizerFactory. In your field type definition you specified only query analyzer but not index analyzer. You can use this directly: What do you mean by removing whitespace from a multivalued field as they come from the database?
Correct way to use tokenizer for whitespace
Hi, In my schema.xml, I am trying to remove whitespace from a multivalued field as they come from the database. Is this the correct way: I do not believe this is working. Thanks!
Re: Multi Select Facets through Java API
With your example I got it working nicely with addFacetField and addFilterQuery in the API. Thanks, I appreciate the help.

Britske wrote:
>
> Something like this?
>
> q=mainquery&fq={!tag=carfq}cars:corvette OR cars:camaro&facet=on&facet.field={!ex=carfq key=carfacet}cars
>
> - The facet "carfacet" is independent of the filter query that filters on cars.
> - You construct the filter query (fq={!tag=carfq}cars:corvette OR cars:camaro) yourself in your application layer.
>
> Perhaps a disadvantage is that you get a lot of different filter queries
> which are all independently cached... I don't see any other way at the
> moment though.
>
> Geert-Jan
>
> 2010/3/22 homerlex
>
>> bump - anyone?
>> --
>> View this message in context: http://old.nabble.com/Multi-Select-Facets-through-Java-API-tp27951014p27986301.html

--
View this message in context: http://old.nabble.com/Multi-Select-Facets-through-Java-API-tp27951014p27989508.html
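For anyone assembling the request by hand instead of through the Java API, the tagged filter / excluded facet pair from Britske's example can be built like this. It is a Python sketch under assumptions: the function name multi_select_params and its arguments are hypothetical, and the selected values are not escaped, so values containing spaces or query syntax characters would need quoting first.

```python
from urllib.parse import urlencode, parse_qs


def multi_select_params(main_query, field, selected, tag, key):
    """Build q/fq/facet params where the facet ignores its own filter."""
    # Filter query tagged so the facet can exclude it by name.
    fq = "{!tag=%s}" % tag + " OR ".join(
        "%s:%s" % (field, value) for value in selected
    )
    # Facet on the same field, excluding the tagged filter.
    facet_field = "{!ex=%s key=%s}%s" % (tag, key, field)
    return [
        ("q", main_query),
        ("fq", fq),
        ("facet", "on"),
        ("facet.field", facet_field),
    ]


params = multi_select_params(
    "mainquery", "cars", ["corvette", "camaro"], tag="carfq", key="carfacet"
)
query_string = urlencode(params)

# Decoding the query string gives back the raw local-params syntax.
decoded = parse_qs(query_string)
print(decoded["fq"][0])
print(decoded["facet.field"][0])
```

Each distinct fq string is cached separately, which is the trade-off Geert-Jan mentions.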
Re: use termscomponent like spellComponent ?!
Thanks. I tried to patch Solr with SOLR-1316 but it does not work =( Do I need to check out the nightly from svn? http://svn.apache.org/repos/asf/lucene/solr/

When I apply the patch and then build the WAR, it is only 40 MB ...

Grant Ingersoll-6 wrote:
>
> See https://issues.apache.org/jira/browse/SOLR-1316
>
> On Mar 21, 2010, at 2:34 PM, stocki wrote:
>
>> hello.
>>
>> i play with solr but i didn`t find the perfect solution for me.
>>
>> my goal is a search like the amazonsearch from the iPhoneApp. ;)
>>
>> it is possible to use the TermsComponent like the SpellComponent ? So, that
>> works termsComp with more than one single Term ?!
>>
>> i got these 3 docs with the name in my index:
>> - nikon one
>> - nikon two
>> - nikon three
>>
>> so when ich search for "nik" termsCom suggest me "nikon". thats correctly
>> whar i want.
>> but when i type "nikon on" i want that solr suggest me "nikon one" ,
>>
>> how is that realizable ??? pleeease help me somebody ;)
>>
>> a merge of TC nad SC where best solution in think so.
>>
>> > required="true" />
>> this is my searchfield. did i use the correct type ?
>>
>> --
>> View this message in context: http://old.nabble.com/use-termscomponent-like-spellComponent--%21-tp27977008p27977008.html
>
> --
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem using Solr/Lucene:
> http://www.lucidimagination.com/search

--
View this message in context: http://old.nabble.com/use-termscomponent-like-spellComponent--%21-tp27977008p27988620.html
Re: Query interface
Hey ... Thank you very much .. been strugling with this for hours now :( Will have to change the feature .. somehow :D Kind regards Armando Abdelhamid ABID wrote: Hi, I think there isn't better than using XSLT as a mean to query solr and render results. Within an xslt file you would combine search form with search results in one place, by this way you free the server from the heavy duty tasks of xslt transformation and let the client -which is in the most cases a browser- do the work. On 3/22/10, Gora Mohanty wrote: On Mon, 22 Mar 2010 15:26:41 +0100 Sebastian Funk wrote: hey there, i've been using solr for some time now and set everything up the way it's supposed to.. now for the user interface: simply writing a javascript (or something else) website that passes the query-URL to solr and interprets the XML given as a result. is that the easiest way? i've noticed some problems with umlauts etc.. when using jetty or tomcat as a server.. is there another way to query solr and retrieve the results? [...] Many modern frameworks (I certainly know of Ruby on Rails, and Django), have Solr integrated via an application. I really like Django Haystack for how it offers an easy way to get started with various search back-ends, with a very Django-ish feel to the interface: http://haystacksearch.org/ Regards, Gora
Re: Query interface
Hi, I think there isn't better than using XSLT as a mean to query solr and render results. Within an xslt file you would combine search form with search results in one place, by this way you free the server from the heavy duty tasks of xslt transformation and let the client -which is in the most cases a browser- do the work. On 3/22/10, Gora Mohanty wrote: > > On Mon, 22 Mar 2010 15:26:41 +0100 > Sebastian Funk wrote: > > > hey there, > > > > i've been using solr for some time now and set everything up the > > way it's supposed to.. > > now for the user interface: simply writing a javascript (or > > something else) website that passes the query-URL to solr and > > interprets the XML given as a result. is that the easiest way? > > i've noticed some problems with umlauts etc.. when using jetty or > > tomcat as a server.. > > > > is there another way to query solr and retrieve the results? > > [...] > > Many modern frameworks (I certainly know of Ruby on Rails, and > Django), have Solr integrated via an application. I really like > Django Haystack for how it offers an easy way to get started with > various search back-ends, with a very Django-ish feel to the > interface: http://haystacksearch.org/ > > Regards, > > Gora > -- Abdelhamid ABID Software Engineer- J2EE / WEB / ESB MULE
Re: use termscomponent like spellComponent ?!
See https://issues.apache.org/jira/browse/SOLR-1316 On Mar 21, 2010, at 2:34 PM, stocki wrote: > > hello. > > i play with solr but i didn`t find the perfect solution for me. > > my goal is a search like the amazonsearch from the iPhoneApp. ;) > > it is possible to use the TermsComponent like the SpellComponent ? So, that > works termsComp with more than one single Term ?! > > i got these 3 docs with the name in my index: > - nikon one > - nikon two > - nikon three > > so when ich search for "nik" termsCom suggest me "nikon". thats correctly > whar i want. > but when i type "nikon on" i want that solr suggest me "nikon one" , > > how is that realizable ??? pleeease help me somebody ;) > > a merge of TC nad SC where best solution in think so. > > required="true" /> > this is my searchfield. did i use the correct type ? > > > -- > View this message in context: > http://old.nabble.com/use-termscomponent-like-spellComponent--%21-tp27977008p27977008.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
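The behaviour stocki describes ("nik" and "nikon on" both leading to "nikon one") amounts to prefix matching against the whole name rather than against individual tokens. The Python sketch below shows that idea with a sorted list and binary search. It is only an illustration of whole-phrase prefix lookup, not a description of how the SOLR-1316 patch works internally, and the suggest function is a made-up helper.

```python
import bisect

# The three example names from the question, kept as whole phrases.
names = sorted(["nikon one", "nikon two", "nikon three"])


def suggest(prefix, sorted_names, limit=5):
    """Return up to `limit` names that start with `prefix`."""
    start = bisect.bisect_left(sorted_names, prefix)
    matches = []
    for name in sorted_names[start:]:
        if not name.startswith(prefix):
            break  # sorted order: no later name can match
        matches.append(name)
        if len(matches) == limit:
            break
    return matches


print(suggest("nik", names))       # ['nikon one', 'nikon three', 'nikon two']
print(suggest("nikon on", names))  # ['nikon one']
```

In Solr terms this corresponds to running the prefix lookup over a field whose value is kept as a single term, so the space is part of the prefix.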
Re: Question about query
Hey Thank you for your reply .. but it's not working ... I still get other articles Kind regards Armando Abdelhamid ABID wrote: Well, here what I figure out ! (mm=1<50% , qf=topic , q="1" "0" ) ==> q=topic:0 or topic:1 On 3/22/10, Armando Ota wrote: Hi I need a little help with query for my problem (if it can be solved) I have a field in a document called topic this field contains some values, 0 (for no topic) or 1 (topic 1), 2, 3, etc ... It can contain many values like 1, 10, 50, etc (for 1 doc) So now to the problem: I would like to get documents that have 0 for topic value and documents that only have for example 1 for topic value inserted articles for example: article 1topics: 1, 5, 10, 20, 24 article 2 topics: 0 article 3 topics: 1 article 4 topic: 5, 10, 20 article 5 topic: 1, 13, 19 So I need search query to return me only article 2 and 3 not other articles with 1 for topic value Can that be done ? Any help appreciated Kind regards Armando
DIH - Categories not indexed ????
Hello.

I have the same database as in this example:
http://wiki.apache.org/solr/DataImportHandler?highlight=(dih)#Full_Import_Example

This is my data-config.xml.

I have absolutely no idea why Solr didn't index the category name and category_id. One product can have more than one value.

Please help me, someone. ;)
Re: Question about query
Well, here is what I figured out:

(mm=1<50%, qf=topic, q="1" "0") ==> q=topic:0 OR topic:1

On 3/22/10, Armando Ota wrote:
> Hi
>
> I need a little help with a query for my problem (if it can be solved).
>
> I have a field in a document called topic. This field contains some values:
> 0 (for no topic) or 1 (topic 1), 2, 3, etc. It can contain many values,
> like 1, 10, 50, etc. (for one doc).
>
> So now to the problem: I would like to get documents that have 0 as the
> topic value, and documents that have only, for example, 1 as the topic value.
>
> Example articles:
> article 1 topics: 1, 5, 10, 20, 24
> article 2 topics: 0
> article 3 topics: 1
> article 4 topics: 5, 10, 20
> article 5 topics: 1, 13, 19
>
> So I need a search query that returns only articles 2 and 3, not the other
> articles that have 1 as a topic value.
>
> Can that be done? Any help appreciated.
>
> Kind regards
> Armando

--
Elsadek
Software Engineer - J2EE / WEB / ESB MULE
Re: synonyms problem
Have you tried increasing the memory size? We had some out-of-memory problems when we used the default memory size.

Kind regards
Armando

michaelnazaruk wrote:
> Hi all! I have a small problem with synonyms. When my synonyms.txt file
> contains a line such as:
>
> aberrant=>abnormal,unusual,deviant,anomalous,peculiar,uncharacteristic,irregular,atypical
>
> everything is fine. But if the file contains:
>
> aberrant,abnormal,unusual,deviant,anomalous,peculiar,uncharacteristic,irregular,atypical
>
> I get an out-of-memory exception.
synonyms problem
Hi all!

I have a small problem with synonyms. When my synonyms.txt file contains a line such as:

aberrant=>abnormal,unusual,deviant,anomalous,peculiar,uncharacteristic,irregular,atypical

everything is fine. But if the file contains:

aberrant,abnormal,unusual,deviant,anomalous,peculiar,uncharacteristic,irregular,atypical

I get an out-of-memory exception.
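One difference between the two forms that may matter here: assuming the synonym filter is configured with expand="true", a comma-separated line is treated as if every term in the list mapped to every term in the list, whereas the => form maps only the left-hand side. The Python below is a rough model of that rule, not Solr code; count_mappings is a made-up helper used only to count how many term-to-term mappings one line stands for.

```python
def count_mappings(line: str, expand: bool = True) -> int:
    """Count term-to-term mappings that one synonyms.txt line stands for.

    Simplified model of the documented behaviour, not Solr's parser:
    - "a=>b,c" maps each left-hand term to each right-hand term.
    - "a,b,c" with expand=True maps every term to every term in the list;
      with expand=False every term maps to the first term only.
    """
    if "=>" in line:
        left, right = line.split("=>", 1)
        sources = [t.strip() for t in left.split(",") if t.strip()]
        targets = [t.strip() for t in right.split(",") if t.strip()]
        return len(sources) * len(targets)
    terms = [t.strip() for t in line.split(",") if t.strip()]
    return len(terms) * len(terms) if expand else len(terms)


explicit = "aberrant=>abnormal,unusual,deviant,anomalous,peculiar,uncharacteristic,irregular,atypical"
equivalent = "aberrant,abnormal,unusual,deviant,anomalous,peculiar,uncharacteristic,irregular,atypical"

print(count_mappings(explicit))    # 1 source term x 8 target terms
print(count_mappings(equivalent))  # 9 terms x 9 terms
```

Whether this alone accounts for the out-of-memory error depends on how many such lines the file has and on the heap size; the sketch only shows why the second form is the heavier one.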
Re: Solr crashing while extracting from very simple text file
I thought you might ask that :-)

It's because the pdf files are scanned from paper documents and OCR'd to produce text. They still contain the image so are huge. The smaller files are about 40 MB and cause a Java out of heap memory error. The larger files are getting close to 500 MB. I didn't have anything to do with the scanning.

I'm guessing but it seems that something in the Tomcat / Solr / Tika implementation tries to load it all into memory at once. pdftotext (part of http://www.foolabs.com/xpdf/download.html ) seems to do it nicely and processes small chunks at a time.

Ross

On Mon, Mar 22, 2010 at 9:43 AM, Erik Hatcher wrote:
> Why not feed the original PDF files in instead? Just curious if pdftotext
> is doing a better job than Tika's PDFBox stuff.
>
> Erik
>
> On Mar 22, 2010, at 9:30 AM, Ross wrote:
>
>> Thanks Georg
>>
>> I don't think it's that because it crashes on a one word test file I
>> create using the nano editor. I don't think nano is adding anything
>> extra.
>>
>> My real files are created by a Windows utility called pdftotext. I
>> solved the problem by getting pdftotext to generate html files rather
>> than plain text. It just adds an html header and wraps everything in a
>> tag. That seems to keep Solr happy.
>>
>> Ross
>>
>> On Mon, Mar 22, 2010 at 9:08 AM, György Frivolt wrote:
>>>
>>> Hi,
>>>
>>> I had problem with indexing documents some months ago as well. I found
>>> that there were XML control characters in the documents and these were
>>> not handled by Solr. Maybe it is the case for you as well.
>>>
>>> Regards,
>>>
>>> Georg
>>>
>>> On Sun, Mar 21, 2010 at 5:58 PM, Ross wrote:
>>>> Hi all
>>>>
>>>> I'm trying to import some text files. I'm mostly following Avi
>>>> Rappoport's tutorial. Some of my files cause Solr to crash while
>>>> indexing. I've narrowed it down to a very simple example.
>>>>
>>>> I have a file named test.txt with one line. That line is the word
>>>> XXBLE and nothing else
>>>>
>>>> This is the command I'm using.
>>>>
>>>> curl "http://localhost:8080/solr-example/update/extract?literal.id=1&commit=true" -F "myfi...@test.txt"
>>>>
>>>> The result is pasted below. Other files work just fine. The problem
>>>> seems to be related to the letters B and E. If I change them to
>>>> something else or make them lower case then it works. In my real
>>>> files, the XX is something else but the result is the same. It's a
>>>> common word in the files. I guess for this "quick and dirty" job I'm
>>>> doing I could do a bulk replace in the files to make it lower case.
>>>>
>>>> Is there any workaround for this?
>>>>
>>>> Thanks
>>>> Ross
>>>>
>>>> Apache Tomcat/6.0.20 - Error report
>>>> HTTP Status 500 - org.apache.tika.exception.TikaException: Unexpected
>>>> RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
>>>> [...]
Question about query
Hi

I need a little help with a query for my problem (if it can be solved).

I have a field in a document called topic. This field contains some values: 0 (for no topic) or 1 (topic 1), 2, 3, etc. It can contain many values, like 1, 10, 50, etc. (for one doc).

So now to the problem: I would like to get documents that have 0 as the topic value, and documents that have only, for example, 1 as the topic value.

Example articles:
article 1 topics: 1, 5, 10, 20, 24
article 2 topics: 0
article 3 topics: 1
article 4 topics: 5, 10, 20
article 5 topics: 1, 13, 19

So I need a search query that returns only articles 2 and 3, not the other articles that have 1 as a topic value.

Can that be done? Any help appreciated.

Kind regards
Armando
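One way to express "topic is 0, or topic is 1 and nothing else" is to require 1 and exclude every other value. Assuming topic is a multiValued numeric field whose type supports range queries (an assumption, since the schema was not posted), the query could look something like topic:0 OR (topic:1 AND -topic:[2 TO *]). The Python below does not talk to Solr; it only mirrors that logic on the five example articles to show which ones should come back.

```python
# Example articles from the question: id -> topic values
articles = {
    1: [1, 5, 10, 20, 24],
    2: [0],
    3: [1],
    4: [5, 10, 20],
    5: [1, 13, 19],
}


def matches(topics, wanted=1):
    """Mirror of: topic:0 OR (topic:<wanted> AND no other topic value)."""
    if 0 in topics:
        return True
    return wanted in topics and all(t == wanted for t in topics)


selected = [doc_id for doc_id, topics in articles.items() if matches(topics)]
print(selected)  # [2, 3]
```

A plain topic:0 OR topic:1 cannot give this result, because it also matches articles 1 and 5, which contain 1 next to other values.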
Re: Query interface
On Mon, 22 Mar 2010 15:26:41 +0100 Sebastian Funk wrote:
> hey there,
>
> i've been using solr for some time now and set everything up the
> way it's supposed to..
> now for the user interface: simply writing a javascript (or
> something else) website that passes the query-URL to solr and
> interprets the XML given as a result. is that the easiest way?
> i've noticed some problems with umlauts etc.. when using jetty or
> tomcat as a server..
>
> is there another way to query solr and retrieve the results?
[...]

Many modern frameworks (I know of Ruby on Rails and Django, at least) offer Solr integration through an add-on application. I really like Django Haystack for how it offers an easy way to get started with various search back-ends, with a very Django-ish feel to the interface: http://haystacksearch.org/

Regards,
Gora
Query interface
Hey there,

I've been using Solr for some time now and have set everything up the way it's supposed to be.

Now for the user interface: is the easiest way simply to write a JavaScript (or other) website that passes the query URL to Solr and interprets the XML returned as the result? I've noticed some problems with umlauts etc. when using Jetty or Tomcat as the server.

Is there another way to query Solr and retrieve the results?

Thanks for any help,
Sebastian Funk
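Umlaut trouble often comes from the query string not being UTF-8 percent-encoded before it reaches the servlet container (for Tomcat, the connector's URI encoding setting also matters). Here is a small sketch of building the select URL on the client side. The host, port and the field name "name" are placeholders, and wt=json is used instead of XML only to keep parsing simple.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, query: str, rows: int = 10) -> str:
    """Build a Solr /select URL with the query UTF-8 percent-encoded."""
    params = {"q": query, "rows": rows, "wt": "json"}
    # urlencode percent-encodes non-ASCII characters as UTF-8 by default
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Placeholder host and field name, for illustration only
url = build_select_url("http://localhost:8983/solr", "name:Müller")
print(url)
```

The same idea applies in JavaScript with encodeURIComponent; the point is that the umlaut travels as %C3%BC rather than as a raw byte in some other encoding.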
Re: Multi Select Facets through Java API
Something like this?

q=mainquery&fq={!tag=carfq}cars:corvette OR cars:camaro&facet=on&facet.field={!ex=carfq key=carfacet}cars

- The facet "carfacet" is independent of the filter query that filters on cars.
- You construct the filter query (fq={!tag=carfq}cars:corvette OR cars:camaro) yourself in your application layer.

Perhaps a disadvantage is that you get a lot of different filter queries, which are all cached independently. I don't see any other way at the moment, though.

Geert-Jan

2010/3/22 homerlex
> bump - anyone?
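The same request can be assembled in code so the local params stay readable. This is a plain-Python sketch of the parameter list only; the thread asks about the Java API, but the parameter names and values are the same whichever client sends them. The names mainquery, cars, carfq and carfacet are taken from the example in the message; multiselect_params is a made-up helper.

```python
from urllib.parse import urlencode


def multiselect_params(main_query, field, selected, tag="carfq", key="carfacet"):
    """Build params for a multi-select facet on one field.

    The filter query is tagged, and the facet excludes that tag, so the
    facet counts are computed as if the filter were not applied.
    """
    fq = " OR ".join(f"{field}:{value}" for value in selected)
    return [
        ("q", main_query),
        ("fq", f"{{!tag={tag}}}{fq}"),
        ("facet", "on"),
        ("facet.field", f"{{!ex={tag} key={key}}}{field}"),
    ]


params = multiselect_params("mainquery", "cars", ["corvette", "camaro"])
query_string = urlencode(params)
print(query_string)
```

Keeping the parameters as a list of pairs rather than a dict leaves room for several fq or facet.field entries in one request.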
Re: Solr crashing while extracting from very simple text file
Why not feed the original PDF files in instead? Just curious if pdftotext is doing a better job than Tika's PDFBox stuff.

Erik

On Mar 22, 2010, at 9:30 AM, Ross wrote:
> Thanks Georg
>
> I don't think it's that because it crashes on a one word test file I
> create using the nano editor. I don't think nano is adding anything
> extra.
>
> My real files are created by a Windows utility called pdftotext. I
> solved the problem by getting pdftotext to generate html files rather
> than plain text. It just adds an html header and wraps everything in a
> tag. That seems to keep Solr happy.
>
> Ross
>
> On Mon, Mar 22, 2010 at 9:08 AM, György Frivolt wrote:
>> Hi,
>>
>> I had problem with indexing documents some months ago as well. I found
>> that there were XML control characters in the documents and these were
>> not handled by Solr. Maybe it is the case for you as well.
>>
>> Regards,
>>
>> Georg
>>
>> On Sun, Mar 21, 2010 at 5:58 PM, Ross wrote:
>>> Hi all
>>>
>>> I'm trying to import some text files. I'm mostly following Avi
>>> Rappoport's tutorial. Some of my files cause Solr to crash while
>>> indexing. I've narrowed it down to a very simple example.
>>>
>>> I have a file named test.txt with one line. That line is the word
>>> XXBLE and nothing else
>>>
>>> This is the command I'm using.
>>>
>>> curl "http://localhost:8080/solr-example/update/extract?literal.id=1&commit=true" -F "myfi...@test.txt"
>>>
>>> The result is pasted below. Other files work just fine. The problem
>>> seems to be related to the letters B and E. If I change them to
>>> something else or make them lower case then it works. In my real
>>> files, the XX is something else but the result is the same. It's a
>>> common word in the files. I guess for this "quick and dirty" job I'm
>>> doing I could do a bulk replace in the files to make it lower case.
>>>
>>> Is there any workaround for this?
>>>
>>> Thanks
>>> Ross
>>>
>>> Apache Tomcat/6.0.20 - Error report
>>> HTTP Status 500 - org.apache.tika.exception.TikaException: Unexpected
>>> RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
>>> [...]
Re: Solr crashing while extracting from very simple text file
Thanks Georg

I don't think it's that because it crashes on a one word test file I create using the nano editor. I don't think nano is adding anything extra.

My real files are created by a Windows utility called pdftotext. I solved the problem by getting pdftotext to generate html files rather than plain text. It just adds an html header and wraps everything in a tag. That seems to keep Solr happy.

Ross

On Mon, Mar 22, 2010 at 9:08 AM, György Frivolt wrote:
> Hi,
>
> I had problem with indexing documents some months ago as well. I found
> that there were XML control characters in the documents and these were not
> handled by Solr. Maybe it is the case for you as well.
>
> Regards,
>
> Georg
>
> On Sun, Mar 21, 2010 at 5:58 PM, Ross wrote:
>
>> Hi all
>>
>> I'm trying to import some text files. I'm mostly following Avi
>> Rappoport's tutorial. Some of my files cause Solr to crash while
>> indexing. I've narrowed it down to a very simple example.
>>
>> I have a file named test.txt with one line. That line is the word
>> XXBLE and nothing else
>>
>> This is the command I'm using.
>>
>> curl "http://localhost:8080/solr-example/update/extract?literal.id=1&commit=true" -F "myfi...@test.txt"
>>
>> The result is pasted below. Other files work just fine. The problem
>> seems to be related to the letters B and E. If I change them to
>> something else or make them lower case then it works. In my real
>> files, the XX is something else but the result is the same. It's a
>> common word in the files. I guess for this "quick and dirty" job I'm
>> doing I could do a bulk replace in the files to make it lower case.
>>
>> Is there any workaround for this?
>>
>> Thanks
>> Ross
>>
>> Apache Tomcat/6.0.20 - Error report
>> HTTP Status 500 - org.apache.tika.exception.TikaException: Unexpected
>> RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
>> [...]
Re: Multi Select Facets through Java API
bump - anyone?
Solr crashing while extracting from very simple text file
Hi,

I had problem with indexing documents some months ago as well. I found that there were XML control characters in the documents and these were not handled by Solr. Maybe it is the case for you as well.

Regards,

Georg

On Sun, Mar 21, 2010 at 5:58 PM, Ross wrote:
> Hi all
>
> I'm trying to import some text files. I'm mostly following Avi
> Rappoport's tutorial. Some of my files cause Solr to crash while
> indexing. I've narrowed it down to a very simple example.
>
> I have a file named test.txt with one line. That line is the word
> XXBLE and nothing else
>
> This is the command I'm using.
>
> curl "http://localhost:8080/solr-example/update/extract?literal.id=1&commit=true" -F "myfi...@test.txt"
>
> The result is pasted below. Other files work just fine. The problem
> seems to be related to the letters B and E. If I change them to
> something else or make them lower case then it works. In my real
> files, the XX is something else but the result is the same. It's a
> common word in the files. I guess for this "quick and dirty" job I'm
> doing I could do a bulk replace in the files to make it lower case.
>
> Is there any workaround for this?
>
> Thanks
> Ross
>
> Apache Tomcat/6.0.20 - Error report
> HTTP Status 500 - org.apache.tika.exception.TikaException: Unexpected
> RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
>
> org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException:
> Unexpected RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
>   at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:211)
>   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>   at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
>   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
>   at java.lang.Thread.run(Thread.java:636)
> Caused by: org.apache.tika.exception.TikaException: Unexpected
> RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
>   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:121)
>   at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:105)
>   at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:190)
>   ... 18 more
> Caused by: java.lang.NullPointerException
>   at java.io.Reader.<init>(Reader.java:78)
>   at java.io.BufferedReader.<init>(BufferedReader.java:93)
>   at java.io.BufferedReader.<init>(BufferedReader.java:108)
>   at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:59)
>   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:119)
>   ... 20 more
>
> type Status report
> message org.apache.tika.exception.TikaException: Unexpected
> RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
>
> org.ap
Re: MLT question
> My question is how can I paginate the results of this query? For example
> instead of setting rows you must specify mlt.count in the params. But how
> can I set the offset? mlt.offset?

As in a non-MLT search request, setting the start parameter should paginate the results.

blargy wrote:
>
> I'm playing around with MLT and I am getting back decent results when
> searching against a particular document.
>
> My question is how can I paginate the results of this query? For example
> instead of setting rows you must specify mlt.count in the params. But how
> can I set the offset? mlt.offset?
>
> Thanks
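A sketch of what paging with start could look like, assuming the request goes to a dedicated MoreLikeThis request handler registered at /mlt (an assumption; with the MLT search component, mlt.count still caps the number of similar documents per result). The handler path, the field names in mlt.fl, the document id and the page size are all placeholders, and mlt_page_params is a made-up helper.

```python
from urllib.parse import urlencode


def mlt_page_params(doc_query: str, page: int, page_size: int = 10) -> dict:
    """Params for one page of similar documents (page numbers start at 0)."""
    return {
        "q": doc_query,             # selects the source document
        "mlt.fl": "title,body",     # placeholder similarity fields
        "start": page * page_size,  # offset into the similar-document list
        "rows": page_size,
    }


third_page = mlt_page_params("id:12345", page=2)
print("/mlt?" + urlencode(third_page))
```

Here start plays the role that the hypothetical mlt.offset in the question would have played.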
Re: distributed solr and tf-idf
Pooja Verlani wrote:
> Hi,
>
> How good is tf-idf across distributed Solr shards (if it works at all with
> Solr 1.4)? Is there a chance of it getting better? I have to implement a
> huge index with many shards. How is it possible to get a global tf-idf for
> it? Any ideas?
>
> Regards,
> Pooja

Distributed idf is not supported in 1.4. There is a patch:

https://issues.apache.org/jira/browse/SOLR-1632

Koji

--
http://www.rondhuit.com/en/
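To see why per-shard idf differs from a global one, here is a small worked sketch. It uses the idf formula from Lucene's DefaultSimilarity, 1 + ln(numDocs / (docFreq + 1)). The shard sizes and document frequencies are invented numbers, and this is not a description of how the SOLR-1632 patch is implemented.

```python
import math


def idf(doc_freq: int, num_docs: int) -> float:
    """idf as in Lucene's DefaultSimilarity: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Invented per-shard statistics for one term: (docFreq, numDocs)
shards = [(9, 1000), (499, 1000)]

per_shard = [idf(df, n) for df, n in shards]
global_idf = idf(sum(df for df, _ in shards), sum(n for _, n in shards))

print(per_shard)   # each shard scores the term with its own idf
print(global_idf)  # one idf computed from the summed statistics
```

When a term is rare on one shard and common on another, the same term gets very different weights depending on which shard a document lives on, so scores merged across shards are not directly comparable. If documents are distributed evenly and randomly, the per-shard values tend to be close to the global one.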
Index field untokenized
Hi All,

I want to index some data untokenized (e.g. a URL), but I can't find a way to do it. I know there is a way to do it in the Solr configuration, but I want to specify this option directly in my Solr XML.

This is a fragment of the XML that I post to Solr. I want to know whether it is possible to add an extra attribute to some field (e.g. modsCollection.name.xlink:href), or to pass in some other way the information about how to index it.

http://www.fao.org/faooa/schemas/eims/v0.9"
xmlns:mods="http://www.loc.gov/mods/v3"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:eims="http://www.fao.org/faooa/schemas/eims/v0.9"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xalan="http://xml.apache.org/xalan"
xmlns:l="http://lang.data"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:ags="http://www.fao.org/agris/agmes/schemas/0.1/"
xmlns:uvalibadmin="http://dl.lib.virginia.edu/bin/admin/admin.dtd/"
xmlns:uvalibdesc="http://dl.lib.virginia.edu/bin/dtd/descmeta/descmeta.dtd"
xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:foxml="info:fedora/fedora-system:def/foxml#"
xmlns:zs="http://www.loc.gov/zing/srw/">

eims-document:1960 . http://aims.fao.org/aos/v01/corporatebody/c_1962 iso639-2b

Regards,
Alessandro

eims-document:1960 Active Note relative à la réforme de l'ONU et de la FAO 2010-03-11T13:37:44.537Z 2010-03-11T13:39:15.819Z 2

AUDREC1 Fedora API-M modifyDatastreamByValue DC fedoraAdmin 2010-03-11T13:37:44.801Z Initial Import of this Object
AUDREC2 Fedora API-M addDatastream MODS fedoraAdmin 2010-03-11T13:39:09.348Z
AUDREC3 Fedora API-M addDatastream AGRISFO fedoraAdmin 2010-03-11T13:39:11.931Z
AUDREC4 Fedora API-M addDatastream EIMS fedoraAdmin 2010-03-11T13:39:13.434Z
AUDREC5 Fedora API-M addDatastream SKOS fedoraAdmin 2010-03-11T13:39:15.819Z

fr Note relative à la réforme de l'ONU et de la FAO pubid.fao.org:210159 FAO info:fedora/eims-document:1960 faooa:FRBR-EXPRESSION J8010 3.3 2006-06-29 fr Note relative à la réforme de l'ONU et de la FAO fao-aos-corporatebody corporate http://aims.fao.org/aos/v01/corporatebody/c_1962 en FAO, Rome (Italy). Fisheries and Aquaculture Dept. marcrelator text Author marcrelator text conference en FAO Committee on Fisheries. Sub-Committee on Aquaculture (Sess. 4 : 6-10 Oct 2008 : Puerto Varas, Chile) marcrelator text Author marcrelator text type Conference type type Non-conventional type iso639-2b code fra iso639-2b code text French text jn J8010 jn rn 210159 0 3 en KC 1 en Publication
distributed solr and tf-idf
Hi,

How good is tf-idf across distributed Solr shards (if it works at all with Solr 1.4)? Is there a chance of it getting better? I have to implement a huge index with many shards. How is it possible to get a global tf-idf for it? Any ideas?

Regards,
Pooja