Hi Thomas,

> A question that remains is this, is it better to use the core Lucene API in my local client
> for the work it does locally with indexes or is it okay to use embedded Solr with SolrJ?

That's a very good question. Hopefully the experts can answer this for us.

I will use SolrJ instead of Lucene because of [1], and because I think the
explanation in [2] is a bit misleading: it only means that the EmbeddedServer
part of SolrJ is deprecated, not the whole API, which remains usable e.g. via
CommonsHttpSolrServer
<http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/impl/CommonsHttpSolrServer.html>.
But I do not know this for sure.

Regards,
Peter.

[1] http://stackoverflow.com/questions/2856427/situations-to-prefer-apache-lucene-over-solr
[2] http://wiki.apache.org/solr/EmbeddedSolr

> While Solr is optimized for the server aspects I'm not sure if it is the best
> option for the client side of things?
>
> Thom
>
>
> On 2010-05-23, at 7:36 AM, Peter Karich wrote:
>
>> Hi,
>>
>> just as a side note as I did not read the link in your conversation:
>>
>> http://wiki.apache.org/lucene-java/NearRealtimeSearch (I just stumbled
>> over this as I am interested in this feature too)
>>
>> Regards,
>> Peter.
>>
>>> Thanks for the new information. It's really great to see so many options for
>>> Lucene.
>>>
>>> In my scenario there are the following pieces:
>>>
>>> 1 - A local Java client with an embedded Solr instance and its own local
>>> index/s.
>>> 2 - A remote server running Solr with index/s that are more like a
>>> repository that local clients query for extra goodies.
>>> 3 - The client is also a JXTA node so it can share indexes or documents too.
>>> 4 - There is no browser involved whatsoever.
>>>
>>> My music composing application is a local client that uses configurations
>>> which would become many different document types. A subset of these
>>> configurations will be bundled with the application and then many more
>>> would be made available via a server/s running Solr.
>>>
>>> I would not expect the queries which would be made from within the local
>>> client to be returned in real-time.
>>> I would only expect such queries to be
>>> made in reasonable time and returned to the client. The client would have
>>> its local Lucene index system (embedded Solr using SolrJ) which would be
>>> updated with the results of the query made to the Solr instance running on
>>> the remote server.
>>>
>>> Then the user on the client would issue queries to the local Lucene index/s
>>> to obtain results which are used to set up contexts for different aspects of
>>> the client. For example: an activated context for musical scales and
>>> rhythms used for creating musical notes, an activated context for rendering
>>> with layout and style information for different music symbol renderer types.
>>>
>>> I'm not yet sure but it may be best to make queries against the local
>>> Lucene index/s and then convert the results into some context objects,
>>> maybe an array or map (I'd like to learn more about how query results can
>>> be returned as arrays or maps as well). Then the tools and renderers which
>>> require the information in the contexts would do any real-time lookup
>>> directly from the context objects, not the local or remote Lucene or Solr
>>> index/s. The local client is also a JXTA node so it can share its own
>>> index/s with fellow peers.
>>>
>>> This is how I envision this happening with my limited knowledge of
>>> Lucene/Solr at this time. What are your thoughts on the feasibility of such
>>> a scenario?
>>>
>>> I'm just reading through the Solr reference PDF now and looking over the
>>> Solr admin application. Looking at the schema.xml it seems to be
>>> field-oriented, not document-oriented. From my point of view I think in
>>> terms of configuration types which would be documents. In the schema it
>>> seems like only fields are defined and it does not matter which
>>> configuration/document they belong to?
>>> I guess this is fine as long as the indexing takes into account my unique
>>> document types and I can search for them as a whole as well, not only for
>>> specific values across a set of indexed documents.
>>>
>>> Also, does the schema allow me to index certain documents into specific
>>> indexes or are they all just bunched together? I'd rather have unique
>>> indexes for specific document types. I've just read about multiple cores
>>> running under one Solr instance, is this the only way to support multiple
>>> indexes?
>>>
>>> I'm thinking of ordering the Lucene in Action v2 book which is due this
>>> month and also the Solr 1.4 book. Before I do I just need to understand a
>>> few things which is why I'm writing such a long message :-)
>>>
>>> Thom
>>>
>>>
>>> On 2010-05-21, at 2:12 AM, Ben Eliott wrote:
>>>
>>>> Further to earlier note re Lucandra. I note that Cassandra, which
>>>> Lucandra backs onto, is 'eventually consistent', so given your real-time
>>>> requirements, you may want to review this in the first instance, if
>>>> Lucandra is of interest.
>>>>
>>>> On 21 May 2010, at 06:12, Walter Underwood wrote:
>>>>
>>>>> Solr is a very good engine, but it is not real-time. You can turn off the
>>>>> caches and reduce the delays, but it is fundamentally not real-time.
>>>>>
>>>>> I work at MarkLogic, and we have a real-time transactional search engine
>>>>> (and repository). If you are curious, contact me directly.
>>>>>
>>>>> I do like Solr for lots of applications -- I chose it when I was at
>>>>> Netflix.
>>>>>
>>>>> wunder
>>>>>
>>>>> On May 20, 2010, at 7:22 PM, Thomas J. Buhr wrote:
>>>>>
>>>>>> Hello Solr,
>>>>>>
>>>>>> Solr looks like an excellent API and it's nice to have a tutorial that
>>>>>> makes it easy to discover the basics of what Solr does, I'm impressed. I
>>>>>> can see plenty of potential uses of Solr/Lucene and I'm interested now
>>>>>> in just how real-time the queries made to an index can be?
>>>>>>
>>>>>> For example, in my application I have time-ordered data being processed
>>>>>> by a paint method in real-time. Each piece of data is identified and its
>>>>>> associated renderer is invoked. The Java2D renderer would then look up
>>>>>> any layout and style values it requires to render the current data it
>>>>>> has received from the layout and style indexes. What I'm wondering is if
>>>>>> this lookup, which would be a Lucene search, will be fast enough?
>>>>>>
>>>>>> Would it be best to make Lucene queries for the relevant layout and
>>>>>> style values required by the renderers ahead of rendering time and have
>>>>>> the query results placed into the most performant collection (map/array)
>>>>>> so renderer lookup would be as fast as possible? Or can Lucene handle
>>>>>> many individual lookup queries fast enough so rendering is quick?
>>>>>>
>>>>>> Best regards from Canada,
>>>>>>
>>>>>> Thom

--
Free your timetabling! http://timefinder.sourceforge.net/
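
On Thom's original question about querying ahead of rendering time: below is
a minimal sketch of that pre-loading idea in plain Java. It is only an
illustration under assumptions. The class StyleContext, the method names and
the keys such as "note.color" are all hypothetical, and fakeIndexLookup merely
stands in for whatever Lucene or SolrJ query the client would really run. The
point is that the index is consulted once per key before painting, and the
renderers afterwards read only from an in-memory map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch: resolve style values once, then serve renderers from memory. */
final class StyleContext {

    /**
     * Runs one lookup per key up front and stores the hits in a map.
     * The lookup function is a placeholder for a real Lucene or SolrJ query.
     */
    static Map<String, String> preload(List<String> keys,
                                       Function<String, Optional<String>> indexLookup) {
        Map<String, String> context = new HashMap<>();
        for (String key : keys) {
            // Keys with no hit are simply left out of the context.
            indexLookup.apply(key).ifPresent(value -> context.put(key, value));
        }
        return context;
    }

    /** Stand-in for an index query; the keys and values are made up. */
    static Optional<String> fakeIndexLookup(String key) {
        Map<String, String> fakeIndex = Map.of(
                "note.color", "black",
                "staff.lineWidth", "1.5");
        return Optional.ofNullable(fakeIndex.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> context = preload(
                List.of("note.color", "staff.lineWidth", "clef.font"),
                StyleContext::fakeIndexLookup);
        // The paint loop would read from the map only, with no index access per frame.
        System.out.println(context.getOrDefault("note.color", "default"));
        System.out.println(context.getOrDefault("clef.font", "default"));
    }
}
```

Whether this is needed at all depends on how many lookups the paint method
makes per frame; the sketch only shows the shape of the "query first, render
from a map" option, not a measurement of either approach.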