I've been thinking about the next major version of Solr.
Here's some brainstorming on goals/ideas:
 - use a standard IOC container for externalization of configuration
and plugins... Spring "springs" to mind as the obvious choice here.
May want to use other spring services such as JMX integration, and
look into JMX management (more than just statistics), etc.
 - support programmatic construction and manipulation of IndexSchema, etc.
 - support some sort of standard RPC mechanism (Thrift, Etch, ???) so
strongly typed language bindings don't have to be developed for every
language (as it seems most people want).  Create an IDL for common
operations and then use a compiler to create the stubs for perl,
python, java, etc.
     - an RPC mechanism that can have multiple operations pending per
socket (and maybe use NIO) would probably be good for distributed
search, etc.
 - allow more lower level index operations... create a new index at a
given spot, merge multiple indices, etc.
 - make Solr more scalable and cloud computing friendly... make it
easier to create and deploy clusters/shards, as well as change the
size of clusters
     - remove the single-master points of failure per-shard (support
or incorporate something like bailey)
     - make it easier to deploy config changes (possibly use
zookeeper... prob want that for cluster management anyway)
     - since solr will have the data, possibly allow plugins that
could do map-reduce, or other interfaces that enable things like
mahout.
 - support more changes w/o manual re-indexing... change the schema
and have Solr re-index in the background (assuming all data is
available via stored fields or elsewhere via a plugin)
 - support more "realtime" search... greatly reducing or eliminating
the lag between adding a document and making it searchable
 - support "tagging" type of updates... quickly updating part of a
document, or data associated with a document
 - try to expose more lower-level Lucene functionality to better
support other projects that want to embed Solr (IOC should hopefully
make Solr easier to embed and customize too)
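
To make the "multiple operations pending per socket" idea above a bit more
concrete, here is a rough sketch using only the JDK. It is not Thrift, Etch,
or any existing Solr code; PendingCalls, Call, begin() and complete() are
made-up names, and the actual socket/NIO I/O is left out. The assumption is
that each request carries an id and the server echoes that id back in the
response, so responses can arrive in any order.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: tracks several in-flight requests on one connection
// by tagging each request with an id that the response is assumed to echo.
final class PendingCalls {

    // One outstanding request: its id plus a future for the response.
    static final class Call {
        final long id;
        final CompletableFuture<String> result = new CompletableFuture<>();

        Call(long id) {
            this.id = id;
        }
    }

    private final AtomicLong nextId = new AtomicLong(1);
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Registers a new request. The caller would then write the id and the
    // request payload to the shared socket (not shown here).
    Call begin() {
        Call call = new Call(nextId.getAndIncrement());
        pending.put(call.id, call.result);
        return call;
    }

    // Invoked by whatever reads response frames off the socket. Returns
    // false if no request with this id is still pending.
    boolean complete(long id, String response) {
        CompletableFuture<String> future = pending.remove(id);
        return future != null && future.complete(response);
    }

    // Number of requests still waiting for a response.
    int pendingCount() {
        return pending.size();
    }
}
```

A distributed search could call begin() once per shard request on the same
connection and then wait on each Call.result, instead of tying up one socket
per outstanding request.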

To support some of these goals, some re-architecture is probably in
the cards.  Caching based on the IndexReader rather than the
IndexSearcher is probably one necessary change.  We should also use
this as an opportunity to clean some things up and improve the core
architecture since this will be a major version change.  But we should
also:
 - continue to support the current main solr web interfaces for
searching and update
 - retain (or improve) the ease of use factor
    - we should always be able to point at an existing Lucene index
and do interesting things with it
    - continue to focus on single-node ease of use for small web developers

As for the future of Solr 1.x, I fully expect a Solr 1.4 release as
well as other 1.x releases after that.

Possible next steps:
  - Have discussions on solr-dev with a subject prefix of "solr2:"
  - We should avoid the temptation to start banging out code (unless
it's just example code) and take some time to really leverage all of
the architectural experience this larger solr-dev community brings.
  - Establish a wiki section for solr2 to capture current consensus...
but generally use solr-dev for ideas and establishing that consensus
  - let java-dev know about this, and ask what in Solr didn't suit their
needs and how we can change that

Onward and upward... Other thoughts & ideas?

-Yonik
