Hello Ryan,

> SQL database such as H2
Mainly to offer joins and be able to perform hierarchical queries, plus
any other types of queries a hybrid SQL search system would offer. This
is something that is best built into SOLR rather than Lucene. It seems
like a lot of the users of SOLR work with SQL databases as well, so it
would seem natural to integrate the two. Also, the Summize realtime
search system that Twitter purchased worked by integrating with MySQL.
The way to do something similar in Lucene would be to integrate with a
Java SQL database. Hierarchical queries could also be performed faster
using this method (though I could be wrong, if there is a better way).

> to have multiple lucene indexes within a single SolrCore?

I don't like the whole multi core thing from an administrative
perspective. It means each index needs a separate schema, configuration,
etc. That becomes hard to manage if 10+ indexes are required, and it is
definitely not as simple as an SQL database, which does not require so
many separate directories and so much manual configuration. It would be
simple to add this into SOLR. In general, though, I have trouble
figuring out many of the design decisions of SOLR, and so I hesitate to
implement things that seem to go against the SOLR design model (is there
one?).

> 9. Distributed search and updates using an object serialization which

Where would I start with integrating this into SOLR? I need some help on
that part of it. Tell me what's best and I'll integrate it; it should be
the easiest on the list.

Jason

On Mon, Sep 15, 2008 at 11:44 AM, Ryan McKinley <[EMAIL PROTECTED]> wrote:
>>
>
> Here are my gut reactions to this list... in general, most of this comes
> down to "sounds great, if someone did the work I'm all for it"!
>
> Also, no need to post to solr-user AND solr-dev, probably better to think of
> solr-user as a superset of solr-dev.
>
>
>> 1. Machine learning based suggest feature
>> https://issues.apache.org/jira/browse/LUCENE-626 which is implemented
>> in a way similar to what Google does in their suggest implementation.
>> The fuzzy based spellchecker is ok, but it would be better to
>> incorporate user behavior.
>> 2. Realtime updates https://issues.apache.org/jira/browse/LUCENE-1313
>> and work being planned for IndexWriter
>> 3. Realtime untokenized field updates
>> https://issues.apache.org/jira/browse/LUCENE-1292
>
> Without knowing the details of these patches, everything sounds great.
>
> In my view, SOLR should offer a nice interface to anything in lucene
> core/contrib
>
>>
>> 4. BM25 Scoring
>
> Again, no idea, but if implemented in lucene, yes
>
>>
>> 5. Integration with an open source SQL database such as H2. This
>> would mean that, under the hood, SOLR would enable storing data in a
>> relational database to allow for joins and things. It would need to
>> be combined with realtime updates. H2 has Lucene integration but it
>> is the usual index-everything-at-once approach, not incremental. The
>> new system would simply index each new row as it is added to a table.
>> The SOLR schema could allow for certain fields being stored in an SQL
>> database.
>
> Sounds interesting -- what is the basic problem you are addressing?
>
> (It seems you are pointing to something specific, and describing your
> solution)
>
>
>>
>> 6. SOLR schema allowing for multiple indexes without using the
>> multicore. The indexes could be defined like SQL tables in the
>> schema.xml file.
>
> Is this just a configuration issue? I definitely hope we can make
> configuration easier in the future.
>
> As is, a custom handler can look at multiple indexes... why is there a need
> to have multiple lucene indexes within a single SolrCore?
>
>
>>
>> 6. Crowd by feature a la GBase
>> http://code.google.com/apis/base/attrs-queries.html#crowding which is
>> similar to Field Collapsing. I am thinking it is advantageous from a
>> performance perspective to obtain an excessive amount of results, then
>> filter down the result set, rather than first sorting a result set.
>
> Again, sounds great! I would love to see it.
>
>>
>> 7. Improved relevance based on user clicks of individual query results
>> for individual queries. This can be thought of as similar to what
>> Digg does. I'm sure Google does something similar. It is a feature
>> that would be of value to almost any SOLR implementation.
>
> Agreed -- if there is a good way to quickly update a field used for
> sorting/scoring, this would happen
>
>>
>> 8. Integration of LocalSolr into the standard SOLR distribution.
>> Location is something many sites use these days and is standard in
>> GBase and most likely other products like FAST.
>
> I'm working on it.... will be a lucene contrib package and cooked into the
> core solr distribution.
>
>
>>
>> 9. Distributed search and updates using an object serialization which
>> could use https://issues.apache.org/jira/browse/LUCENE-1336. This
>> allows span queries, custom payload queries, custom similarities, and
>> custom analyzers, without compiling and deploying a new SOLR war
>> file to individual servers.
>
>
> sounds good (but I have no technical basis to say so)
>
>
> ryan