Re: Solr(j) API for manipulating the schema(.xml)?
Basically, you could create a set of dynamic fields to suit your needs: one dynamic field for each type of data (and for the combinations you need). You can then write a small wrapper around SolrJ that exposes the patterns defined in your schema.xml in a more understandable way. That way you abstract away the manipulation of the schema.xml file and only touch it when it is really needed, i.e. for a new field type with new analyzers, etc.

On Sep 18, 2014, at 3:16 AM, Clemens Wyss DEV wrote:

> As our framework so far only knows a few field types, "dynamic fields" may be
> the way to go... And if there are new field types, the new schema can be
> distributed through ZooKeeper.
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Wednesday, 17 September 2014 19:56
> To: solr-user@lucene.apache.org
> Subject: Re: Solr(j) API for manipulating the schema(.xml)?
>
> Right, you can create new cores over the REST API.
>
> As far as changing the schema, there's no good way to do that that I know of
> programmatically. In the SolrCloud world, you can upload the schema to
> ZooKeeper and have it automatically distributed to all the nodes though.
>
> Best,
> Erick
>
> On Wed, Sep 17, 2014 at 2:28 AM, Clemens Wyss DEV wrote:
>> Is there an API to manipulate/consolidate the schema(.xml) of a Solr core?
>> Through SolrJ?
>>
>> Context:
>> We already have a generic indexing/searching framework (based on Lucene)
>> where any component can act as a so-called IndexDataProvider. This provider
>> delivers the field types and also the entities to be (converted into
>> documents and then) indexed. Each of these IndexDataProviders has its own
>> Lucene index.
>> So we kind of have the information for the Solr schema.xml.
>>
>> Hope the intention is clear. And yes, the manipulation of the schema.xml is
>> basically only needed when the field types change. That's why I am looking
>> for a way to consolidate the schema.xml (upon boot, initialization of the
>> IndexDataProviders ...).
>> In 99.999% of cases it won't change, but I'd like to keep the possibility
>> of an IndexDataProvider handing in "its schema".
>>
>> Also, again driven by the dynamic nature of our framework, can I easily
>> create new cores over SolrJ or the Solr REST API?
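
The wrapper idea from the top of this reply could look roughly like the sketch below. It is not taken from the thread: the class name, the `FieldType` enum and the suffixes (`_s`, `_i`, `_l`, `_dt`, `_t`) are assumptions for illustration and must match the `<dynamicField>` patterns actually declared in your own schema.xml. The sketch only uses the JDK, so the field map would still have to be copied into a SolrJ document by the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: maps logical field names onto dynamic-field naming patterns. */
final class DynamicFieldNames {

    /** Hypothetical suffixes; each must correspond to a dynamicField pattern in schema.xml. */
    enum FieldType {
        STRING("_s"), INT("_i"), LONG("_l"), DATE("_dt"), TEXT("_t");

        final String suffix;

        FieldType(String suffix) {
            this.suffix = suffix;
        }
    }

    /** Returns the name under which the value is indexed, e.g. "title" + TEXT gives "title_t". */
    static String fieldName(String logicalName, FieldType type) {
        if (logicalName == null || logicalName.isEmpty()) {
            throw new IllegalArgumentException("logicalName must not be empty");
        }
        return logicalName + type.suffix;
    }

    /** Collects values under their dynamic-field names, preserving insertion order. */
    static final class DocumentBuilder {
        private final Map<String, Object> fields = new LinkedHashMap<>();

        DocumentBuilder add(String logicalName, FieldType type, Object value) {
            fields.put(fieldName(logicalName, type), value);
            return this;
        }

        /** Returns a copy so later add() calls do not change a document already built. */
        Map<String, Object> build() {
            return new LinkedHashMap<>(fields);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new DocumentBuilder()
                .add("title", FieldType.TEXT, "Hello Solr")
                .add("pageCount", FieldType.INT, 42)
                .build();
        System.out.println(doc);
    }
}
```

With SolrJ on the classpath, each entry of the resulting map could be passed to `SolrInputDocument.addField(name, value)`. The schema.xml then only changes when a new suffix, i.e. a new field type, is introduced.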
[ANNOUNCE] Apache Gora 0.5 Release
Hi Folks,

Apologies for cross-posting.

The Apache Gora team are pleased to announce the immediate availability of Apache Gora 0.5.

The Apache Gora open source framework provides an in-memory data model and persistence for big data. Gora supports persisting to column stores, key-value stores, document stores and RDBMSs, and analyzing the data with extensive Apache Hadoop™ MapReduce support. Gora uses the Apache Software License v2.0.

This release addresses no fewer than 44 issues [0], many of them improvements and new functionality. Most notably, the release includes a new module for MongoDB, shim functionality to support multiple Hadoop versions, improved authentication for Accumulo, better documentation for many modules, and pluggable SolrJ implementations, with http (HttpSolrServer) as the default. Available options are http (HttpSolrServer), cloud (CloudSolrServer), concurrent (ConcurrentUpdateSolrServer) and loadbalance (LBHttpSolrServer).

Suggested Gora database support is as follows:
- Apache Avro 1.7.6
- Apache Hadoop 1.0.1 and 2.4.0
- Apache HBase 0.94.14
- Apache Cassandra 2.0.2
- Apache Solr 4.8.1
- MongoDB 2.6
- Apache Accumulo 1.5.1

Gora is released both as source code, which can be downloaded from our downloads page [1], and as Maven artifacts, which can be found on Maven Central [2].

Thank you
Lewis
(on behalf of the Apache Gora PMC)

[0] http://s.apache.org/0.5report
[1] http://gora.apache.org/downloads.html
[2] http://search.maven.org/#search|ga|1|gora

--
http://people.apache.org/~lewismc || @hectorMcSpector || http://www.linkedin.com/in/lmcgibbney
Apache Gora V.P || Apache Nutch PMC || Apache Any23 V.P || Apache OODT PMC || Apache Open Climate Workbench PMC || Apache Tika PMC || Apache TAC
Re: Help on custom sort
How many different groups are there? And can user A ever be part of more than one group?

If
1> there are a reasonably small number of groups (< 100 or so as a place to start) and
2> a user is always part of a single group
then you could store separate prices in each document by group, thus you'd have some fields like
price_group_a: $100
price_group_b: $101

Then sorting becomes trivial: you just sort on price_group_a for users in group A, etc. If the number of groups is unknown but not huge, dynamic fields could be used.

If that's not the case, then you might be able to get clever with sorting by function. Here's a place to start:
https://cwiki.apache.org/confluence/display/solr/Function+Queries

These can be arbitrarily complex, but I'm thinking of something where the price returned by the function respects the group the user is in, perhaps even the min/max of all the groups the user is in. I admit I haven't really thought that through well though...

Best,
Erick

On Sat, Sep 20, 2014 at 9:26 AM, Scott Smith wrote:
> I need to provide a custom sort option for sorting by price and I would like
> some suggestions. It's not the straightforward "just sort by a price field
> in the document" scenario or I wouldn't be asking for help. Here's the
> scenario I'm dealing with.
>
> I have 100 million+ documents (so multi-sharded). Users search for documents
> they are interested in using a standard keyword search. They then purchase
> documents they are interested in. So far, nothing hard.
>
> Here's where things get "interesting". The documents come from multiple
> suppliers. Each supplier sets a price for his documents and different
> suppliers will provide different pricing.
>
> That wouldn't be difficult except that *users* are divided up into different
> groups and, depending on which group they are in, the supplier will charge
> the user a different price. So, user A may pay one price for a document and
> user B may pay a different price for the same document just because user A
> and user B are in different groups. I don't even know if the relative order
> of pricing is the same between different groups (e.g., if document X is more
> expensive than document Y for a user in group M, it may not be more expensive
> for a user in group N). The one thing that may make this doable is that
> supplier A will likely have the same price for all of his documents for each
> of the user groups. So, a user in group A will pay the same price regardless
> of which document he buys from supplier 1. A user in group B will also pay
> the same price for any document from supplier 1; it's just that a user in
> group B will likely pay a different price than a user in group A. So, within
> a supplier, the price varies based on user group, not the document.
>
> To summarize, one of the requirements for the system is that we provide the
> ability to sort search results based on price. This would be easy except
> that the price a user pays depends not only on what he wants to buy, but
> also on what group he is in.
>
> I suspect there is some kind of custom Solr module I'm going to have to
> write. I'm thinking that the user group gets passed in as a custom Solr
> parameter (I'm assuming that's possible??). Then I'm thinking that there has
> to be some kind of in-memory database that tracks pricing based on user
> group and document supplier.
>
> I'm happy to go read code, documents, links, etc. if someone can point me in
> the right direction. What kind of Solr module am I likely going to write
> (extend) and are there some examples somewhere? Maybe there's a way to do
> this without having to extend a Solr module??
>
> Hope this makes sense. Any help is appreciated.
>
> Scott
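
The per-group price field approach from Erick's reply could be wired up along these lines. This is a minimal sketch under assumptions: the class and method names are made up for illustration, the `price_group_` prefix follows the field names used in the reply, and it presumes a matching field (or dynamic field pattern) exists in the schema. It only builds the value of Solr's standard `sort` request parameter and does not talk to Solr itself.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Sketch only: derives the per-group price field and sort clause for a user. */
final class GroupPriceSort {

    /** Restricts group ids to characters that are safe inside a field name. */
    private static final Pattern SAFE_GROUP = Pattern.compile("[A-Za-z0-9_]+");

    /** Returns e.g. "price_group_a" for group "A"; the prefix is an assumed naming convention. */
    static String priceField(String groupId) {
        if (groupId == null || !SAFE_GROUP.matcher(groupId).matches()) {
            throw new IllegalArgumentException("unsupported group id: " + groupId);
        }
        return "price_group_" + groupId.toLowerCase(Locale.ROOT);
    }

    /** Builds the value for the sort parameter, e.g. "price_group_a asc". */
    static String sortParam(String groupId, boolean ascending) {
        return priceField(groupId) + (ascending ? " asc" : " desc");
    }

    public static void main(String[] args) {
        // A user in group A who wants the cheapest documents first.
        System.out.println("sort=" + sortParam("A", true));
    }
}
```

The resulting string would be sent as the `sort` parameter of the search request. Validating the group id before building the field name keeps user-supplied input from ending up in the sort clause unchecked.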
Re: Will commit/softcommit invalid filtercache?
This should help; about 1/3 of the way down are the answers to your specific questions...

https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick

On Sat, Sep 20, 2014 at 2:38 AM, forest_soup wrote:
> Hi, all.
>
> We have some questions about commit/softcommit and caches.
> We understand that a softcommit will create a new searcher. Will the
> filtercache be invalidated after a softcommit is done?
>
> And also for commit: if we do a commit with openSearcher, will the
> filtercache be invalidated?
>
> Thanks!
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Will-commit-softcommit-invalid-filtercache-tp4160153.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Help on custom sort
I need to provide a custom sort option for sorting by price and I would like some suggestions. It's not the straightforward "just sort by a price field in the document" scenario or I wouldn't be asking for help. Here's the scenario I'm dealing with.

I have 100 million+ documents (so multi-sharded). Users search for documents they are interested in using a standard keyword search. They then purchase documents they are interested in. So far, nothing hard.

Here's where things get "interesting". The documents come from multiple suppliers. Each supplier sets a price for his documents and different suppliers will provide different pricing.

That wouldn't be difficult except that *users* are divided up into different groups and, depending on which group they are in, the supplier will charge the user a different price. So, user A may pay one price for a document and user B may pay a different price for the same document just because user A and user B are in different groups. I don't even know if the relative order of pricing is the same between different groups (e.g., if document X is more expensive than document Y for a user in group M, it may not be more expensive for a user in group N). The one thing that may make this doable is that supplier A will likely have the same price for all of his documents for each of the user groups. So, a user in group A will pay the same price regardless of which document he buys from supplier 1. A user in group B will also pay the same price for any document from supplier 1; it's just that a user in group B will likely pay a different price than a user in group A. So, within a supplier, the price varies based on user group, not the document.

To summarize, one of the requirements for the system is that we provide the ability to sort search results based on price. This would be easy except that the price a user pays depends not only on what he wants to buy, but also on what group he is in.

I suspect there is some kind of custom Solr module I'm going to have to write. I'm thinking that the user group gets passed in as a custom Solr parameter (I'm assuming that's possible??). Then I'm thinking that there has to be some kind of in-memory database that tracks pricing based on user group and document supplier.

I'm happy to go read code, documents, links, etc. if someone can point me in the right direction. What kind of Solr module am I likely going to write (extend) and are there some examples somewhere? Maybe there's a way to do this without having to extend a Solr module??

Hope this makes sense. Any help is appreciated.

Scott
Re: Error Instantiating UpdateRequestProcessorFactory
I’ve found the issue.

- First, my IDE was putting all the Solr JAR dependencies into my custom JAR. I noticed the JAR was 14 MB when it should have been a few KB. I changed this to get a JAR with only my classes in it.

- I then ran into CNFEs for the Solr UpdateRequestProcessorFactory and UpdateRequestProcessor classes. This was because I was adding my JAR to Tomcat’s lib folder, whose JARs are loaded before the Solr web app’s libs, so it was not finding the dependencies. Moving my JAR into the Solr web app’s WEB-INF/lib resolved this issue.

Cheers

On 20 Sep 2014, at 05:30, Shalin Shekhar Mangar wrote:

> Sounds like a class loader issue. Try adding your jar to $SOLR_HOME/lib
> instead of tomcat lib.
>
> Also, upgrade to Solr 4.x, 3.6 is ancient! :)
>
> On Sat, Sep 20, 2014 at 1:13 AM, Allistair C wrote:
>
>> Hi all,
>>
>> I’m in a bit of a cul de sac with an issue, hope you can help.
>>
>> I am creating a custom UpdateRequestProcessor. The Solr documentation
>> details that I need to write a factory class subclassing
>> UpdateRequestProcessorFactory and this should return an instance of my
>> class that subclasses UpdateRequestProcessor.
>>
>> I have done this, and I have created a JAR.
>>
>> I have deployed the JAR into Tomcat’s lib folder where Solr is running.
>>
>> I have modified the solrconfig to include my class correctly.
>>
>> On startup Solr finds my class but does not believe it conforms to being a
>> UpdateRequestProcessorFactory.
>>
>> SEVERE: org.apache.solr.common.SolrException: Error Instantiating
>> UpdateRequestProcessorFactory,
>> com.acme.solr.update.processor.URLRewriteUpdateRequestProcessorFactory is
>> not a org.apache.solr.update.processor.UpdateRequestProcessorFactory
>>     at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:421)
>>
>> Things I have tried:
>>
>> - Ensured that I am compiling my JAR with the exact JDK that is running
>>   Solr.
>> - Downloaded Solr 3.6.2 and copied one of the Solr built-in processors,
>>   renamed it, compiled it and tried to use it - SAME issue.
>> - Created a test that uses the same code as the Solr code that is failing
>>   (namely, isAssignableFrom):
>>
>>     Class clazz =
>>         Class.forName("com.acme.solr.update.processor.URLRewriteUpdateRequestProcessorFactory");
>>     boolean isA =
>>         UpdateRequestProcessorFactory.class.isAssignableFrom(clazz);
>>     System.out.println(isA);
>>
>>   Prints “true” - i.e. it’s perfectly OK!
>>
>> I include my simple processor here:
>>
>> package com.acme.solr.update.processor;
>>
>> public class URLRewriteUpdateRequestProcessorFactory extends
>>         UpdateRequestProcessorFactory
>> {
>>     @Override
>>     public UpdateRequestProcessor getInstance(SolrQueryRequest req,
>>             SolrQueryResponse rsp, UpdateRequestProcessor next) {
>>         return new URLRewriteProcessor(next);
>>     }
>> }
>>
>> class URLRewriteProcessor extends UpdateRequestProcessor
>> {
>>     public URLRewriteProcessor(UpdateRequestProcessor next)
>>     {
>>         super(next);
>>     }
>>
>>     @Override
>>     public void processAdd(AddUpdateCommand cmd) throws IOException
>>     {
>>         SolrInputDocument doc = cmd.getSolrInputDocument();
>>         doc.setField("foo", "bar");
>>
>>         super.processAdd(cmd);
>>     }
>> }
>>
>> At this point I am at a loss and would appreciate any assistance or ideas
>> to try.
>>
>> Cheers
>
> --
> Regards,
> Shalin Shekhar Mangar.
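
An "is not a ..." error on a class that clearly extends the right superclass usually means the superclass was defined by a different class loader than the one the container is comparing against, which also explains why the standalone isAssignableFrom test printed "true". A small diagnostic like the following can help confirm that. It is a sketch, not something from this thread: the class and method names are made up, and it uses only JDK calls, so inside Solr you would call it from your own code with the factory class and `UpdateRequestProcessorFactory.class` as arguments.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: reports which class loaders are involved in defining a class. */
final class ClassLoaderReport {

    /**
     * Lists the loader that defined the class, followed by its parents.
     * A null loader means the bootstrap loader, recorded here as "bootstrap".
     */
    static List<String> loaderChain(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        chain.add("bootstrap");
        return chain;
    }

    /** True when both classes were defined by the very same loader instance. */
    static boolean definedBySameLoader(Class<?> a, Class<?> b) {
        return a.getClassLoader() == b.getClassLoader();
    }

    public static void main(String[] args) {
        System.out.println(loaderChain(ClassLoaderReport.class));
        System.out.println(definedBySameLoader(String.class, Integer.class));
    }
}
```

If the chain printed for the custom factory differs from the chain for the Solr superclass it was compiled against, the JAR is sitting in the wrong lib directory or bundles its own copy of the Solr classes.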
Will commit/softcommit invalid filtercache?
Hi, all.

We have some questions about commit/softcommit and caches. We understand that a softcommit will create a new searcher. Will the filtercache be invalidated after a softcommit is done?

And also for commit: if we do a commit with openSearcher, will the filtercache be invalidated?

Thanks!