Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Nagendra Nagarajayya
Yes, faceting works as before. Regarding the cache, the suggestion is to 
disable the cache for realtime NRT, for now.


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org


On 7/24/2012 2:57 PM, Andy wrote:

Nagendra,

Does RankingAlgorithm work with faceting which requires the use of cache? As 
new documents are added or updated, the cache will be constantly invalidated. 
So how would RankingAlgorithm work in this case?



From: Nagendra Nagarajayya nnagaraja...@transaxtions.com
To: solr-user@lucene.apache.org
Sent: Tuesday, July 24, 2012 8:24 AM
Subject: Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with 
Realtime NRT available for download

Hi Yonik:

Please see my comments below:

On 7/23/2012 8:52 AM, Yonik Seeley wrote:

On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya
nnagaraja...@transaxtions.com   wrote:

Realtime NRT algorithm enables NRT functionality in
Solr by not closing the Searcher object  and so is very fast. I am in the
process of contributing the algorithm back to Apache Solr as a patch.

Since you're in the process of contributing this back, perhaps you
could explain your approach - it never made sense to me.

Replacing the reader in an existing SolrIndexSearcher as you do means
that all the related caches will be invalid (meaning you can't use
solr's caches).  You could just ensure that there is no auto-warming
set up for Solr's caches (which is now the default), or you could
disable caching altogether.  It's not clear what you're comparing
against when you claim it's faster.

Solr with RankingAlgorithm does not replace the reader in the SolrIndexSearcher object. All 
it does is override the IndexSearcher.getIndexReader() method so as to supply an NRTReader 
if realtime is enabled. All direct references to the reader member have been 
replaced with getIndexReader() method calls.

The performance is better because SolrIndexSearcher is not closed every 1 sec as in 
soft-commit. SolrIndexSearcher is a heavy object with caches, etc. and is 
reference counted. So every 1 sec this object needs to be closed and re-allocated, 
the indexes need to be re-opened, and the caches invalidated, while waiting for 
existing searchers to complete, making this very expensive. Realtime NRT does 
not close the SolrIndexSearcher object but makes available a new NRTReader with 
the document updates, i.e. getIndexReader() returns a new NRTReader.


There are also consistency and concurrency issues with replacing the
reader in an existing SolrIndexSearcher, which is supposed to have a
static view of the index.  If a reader replacement happens in the
middle of a request, it's bound to cause trouble, including returning
the wrong documents!

The reader member is not replaced in the existing SolrIndexSearcher object. The 
IndexSearcher.getIndexReader() method has been overridden in SolrIndexSearcher 
and all direct reader member access has been replaced with a getIndexReader() 
method call, allowing an NRT reader to be supplied when realtime is enabled. The 
concurrency is handled by the getNRTReader() method, with the static index view 
now increased to the granularity provided by the NRTIndexReader.


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org


-Yonik
http://lucidimagination.com






Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Nagendra Nagarajayya
Each request thread may return updated results.  Each component may also 
in certain cases return updated results. The algorithm is designed to 
handle these. The granularity of the returned results can be controlled 
through a visible parameter.


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org

On 7/24/2012 5:36 AM, Yonik Seeley wrote:

On Tue, Jul 24, 2012 at 8:24 AM, Nagendra Nagarajayya
nnagaraja...@transaxtions.com  wrote:

SolrIndexSearcher is a heavy object with caches, etc.

As I've said, the caches are configurable, and it's trivial to disable
all caching (to the point where the cache objects are not even
created).


The reader member is not replaced in the existing SolrIndexSearcher object.
The IndexSearcher.getIndexReader() method has been overridden in
SolrIndexSearcher and all direct reader member access has been replaced with
a getIndexReader() method call, allowing an NRT reader to be supplied when
realtime is enabled.

In a single Solr request (that runs through multiple components like
query, highlight, facet, and response writing),
does IndexSearcher.getIndexReader() always return the same reader?  If
not, this breaks pretty much every standard solr component - but it
will only be apparent under load, and if you are carefully sanity
checking the results.

-Yonik
http://lucidimagination.com






Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Mark Miller
You are changing the name, or someone at Apache told you the current name is 
okay?

If someone at Apache told you it was okay, who was that?

You are certainly not using the Solr mark in an approved manner and I'd hope if 
you are going to take advantage of our mailing list for promotion of your 
product, that you would not violate our trademark. You are already on shaky 
ground promoting a Solr fork on the Solr mailing list by announcing every 
release - naming your fork something with Solr in it puts you over the edge on 
my list.

We don't allow people to name their products things like Solr: the wonder 
edition or anything along those lines. Solr is our trademark and third party 
products must have their own name. The only thing we allow is the phrase 
powered by Solr.

I'm on the Lucene/Solr PMC and am an Apache member and I'd find it pretty hard 
to believe that anyone would suggest that your usage is a correct usage of the 
Solr trademark.

- Mark

On Jul 24, 2012, at 8:36 AM, Nagendra Nagarajayya wrote:

 Thanks Mark! I am already working with Apache Software Foundation on the mark 
 and am using the correct usage of the mark as suggested by them.
 
 Regards,
 
 Nagendra Nagarajayya
 http://solr-ra.tgels.org
 http://rankingalgorithm.tgels.org
 
 
 On 7/23/2012 12:15 PM, Mark Miller wrote:
 On Jul 23, 2012, at 11:27 AM, Nagendra Nagarajayya wrote:
 
 I am not sure why anyone would be offended by an announcement that NRT 
 functionality was available with older releases.
 FWIW, I'm not offended - I don't mind if third parties post announcements if 
 they are related to Solr.
 
 I just want to make sure it's very clear that it's a third party announce so 
 there is no confusion - people that don't follow the lists on a daily basis 
 read these things. A lot of these emails end up archived on various sites 
 that collect mailing lists. It's easy to run into them without the proper 
 context.
 
 I think part of the confusion is the naming. Technically, Apache does not 
 allow the use of Apache marks as part of a third party name. Instead, the 
 name should be something like Product X, powered by Solr
 
 See http://www.apache.org/foundation/marks/faq/#products
 
 - Mark Miller
 lucidimagination.com
 

- Mark Miller
lucidimagination.com

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Andy
But Solr relies on the cache for faceting performance. If the cache has to be 
disabled, then faceting would be very slow under RankingAlgorithm, 
no?
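
To make the cache question concrete, here is a rough sketch of a facet-count
cache whose entries are only valid for the reader they were computed against.
The names (FacetCountCache, readerVersion) are made up for illustration and are
an assumption; this is not Solr's actual cache implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical cache: entries are valid only for the reader version they were built on.
final class FacetCountCache {
    private long cachedForVersion = -1;
    private final Map<String, Integer> counts = new HashMap<>();
    int hits = 0;
    int misses = 0;

    int get(long readerVersion, String field, Supplier<Integer> compute) {
        if (readerVersion != cachedForVersion) {
            // A different reader means the old counts may be wrong: drop everything.
            counts.clear();
            cachedForVersion = readerVersion;
        }
        Integer cached = counts.get(field);
        if (cached != null) {
            hits++;
            return cached;
        }
        misses++;
        int value = compute.get(); // the expensive facet computation
        counts.put(field, value);
        return value;
    }
}
```

If the reader stays the same across requests, every lookup after the first is a
hit. If every request sees a new reader, every lookup is a miss, which is the
situation being asked about here.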



From: Nagendra Nagarajayya nnagaraja...@transaxtions.com
To: solr-user@lucene.apache.org
Sent: Wednesday, July 25, 2012 9:12 AM
Subject: Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime 
NRT available for download
 
Yes, faceting works as before. Regarding the cache, the suggestion is to 
disable the cache for realtime NRT, for now.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org


On 7/24/2012 2:57 PM, Andy wrote:
 Nagendra,

 Does RankingAlgorithm work with faceting which requires the use of cache? As 
 new documents are added or updated, the cache will be constantly invalidated. 
 So how would RankingAlgorithm work in this case?


 
 From: Nagendra Nagarajayya nnagaraja...@transaxtions.com
 To: solr-user@lucene.apache.org
 Sent: Tuesday, July 24, 2012 8:24 AM
 Subject: Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with 
 Realtime NRT available for download

 Hi Yonik:

 Please see my comments below:

 On 7/23/2012 8:52 AM, Yonik Seeley wrote:
 On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya
 nnagaraja...@transaxtions.com   wrote:
 Realtime NRT algorithm enables NRT functionality in
 Solr by not closing the Searcher object  and so is very fast. I am in the
 process of contributing the algorithm back to Apache Solr as a patch.
 Since you're in the process of contributing this back, perhaps you
 could explain your approach - it never made sense to me.

 Replacing the reader in an existing SolrIndexSearcher as you do means
 that all the related caches will be invalid (meaning you can't use
 solr's caches).  You could just ensure that there is no auto-warming
 set up for Solr's caches (which is now the default), or you could
 disable caching altogether.  It's not clear what you're comparing
 against when you claim it's faster.
 Solr with RankingAlgorithm does not replace the reader in the SolrIndexSearcher 
 object. All it does is override the IndexSearcher.getIndexReader() method so 
 as to supply an NRTReader if realtime is enabled. All direct references to the 
 reader member have been replaced with getIndexReader() method calls.

 The performance is better because SolrIndexSearcher is not closed every 1 sec as 
 in soft-commit. SolrIndexSearcher is a heavy object with caches, etc. and is 
 reference counted. So every 1 sec this object needs to be closed and re-allocated, 
 the indexes need to be re-opened, and the caches invalidated, while waiting for 
 existing searchers to complete, making this very expensive. Realtime NRT does 
 not close the SolrIndexSearcher object but makes available a new NRTReader 
 with the document updates, i.e. getIndexReader() returns a new NRTReader.

 There are also consistency and concurrency issues with replacing the
 reader in an existing SolrIndexSearcher, which is supposed to have a
 static view of the index.  If a reader replacement happens in the
 middle of a request, it's bound to cause trouble, including returning
 the wrong documents!
 The reader member is not replaced in the existing SolrIndexSearcher object. 
 The IndexSearcher.getIndexReader() method has been overridden in 
 SolrIndexSearcher and all direct reader member access has been replaced with 
 a getIndexReader() method call, allowing an NRT reader to be supplied when 
 realtime is enabled. The concurrency is handled by the getNRTReader() method, 
 with the static index view now increased to the granularity provided by the 
 NRTIndexReader.


 Regards,

 Nagendra Nagarajayya
 http://solr-ra.tgels.org
 http://rankingalgorithm.tgels.org

 -Yonik
 http://lucidimagination.com



Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Nagendra Nagarajayya

Mark,

Grant Ingersoll from ASF got in touch with me to ensure that I am 
compliant with the Apache Trade Mark. I made changes to the names, web 
pages, wiki, papers, etc. and sent back the links to Grant for approval. 
You may want to check with Grant.


Regarding the fork, I am not creating a fork but actually contributing 
the realtime NRT back to Apache Solr.  There was no NRT functionality in 
the older versions of Solr.


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org



On 7/25/2012 6:54 AM, Mark Miller wrote:

You are changing the name, or someone at Apache told you the current name is 
okay?

If someone at Apache told you it was okay, who was that?

You are certainly not using the Solr mark in an approved manner and I'd hope if 
you are going to take advantage of our mailing list for promotion of your 
product, that you would not violate our trademark. You are already on shaky 
ground promoting a Solr fork on the Solr mailing list by announcing every 
release - naming your fork something with Solr in it puts you over the edge on 
my list.

We don't allow people to name their products things like Solr: the wonder edition or 
anything along those lines. Solr is our trademark and third party products must have their own 
name. The only thing we allow is the phrase powered by Solr.

I'm on the Lucene/Solr PMC and am an Apache member and I'd find it pretty hard 
to believe that anyone would suggest that your usage is a correct usage of the 
Solr trademark.

- Mark

On Jul 24, 2012, at 8:36 AM, Nagendra Nagarajayya wrote:


Thanks Mark! I am already working with Apache Software Foundation on the mark 
and am using the correct usage of the mark as suggested by them.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org


On 7/23/2012 12:15 PM, Mark Miller wrote:

On Jul 23, 2012, at 11:27 AM, Nagendra Nagarajayya wrote:


I am not sure why anyone would be offended by an announcement that NRT 
functionality was available with older releases.

FWIW, I'm not offended - I don't mind if third parties post announcements if 
they are related to Solr.

I just want to make sure it's very clear that it's a third party announce so 
there is no confusion - people that don't follow the lists on a daily basis 
read these things. A lot of these emails end up archived on various sites that 
collect mailing lists. It's easy to run into them without the proper context.

I think part of the confusion is the naming. Technically, Apache does not allow the use 
of Apache marks as part of a third party name. Instead, the name should be something like 
Product X, powered by Solr

See http://www.apache.org/foundation/marks/faq/#products

- Mark Miller
lucidimagination.com

- Mark Miller
lucidimagination.com

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-25 Thread Mark Miller
On Wed, Jul 25, 2012 at 11:03 AM, Nagendra Nagarajayya 
nnagaraja...@transaxtions.com wrote:

 Mark,

 Grant Ingersoll from ASF got in touch with me to ensure that I am
 compliant with the Apache Trade Mark. I made changes to the names, web
 pages, wiki, papers, etc. and sent back the links to Grant for approval.
 You may want to check with Grant.


Great, I'm glad to hear it. I didn't understand your original response with
regards to when you had spoken to someone and if a change was coming or you
thought you were already in compliance.



 Regarding the fork, I am not creating a fork but actually contributing the
 realtime NRT back to Apache Solr.  There was no NRT functionality in the
 older versions of Solr.


You have a fork now though - and forks are fine. Anyone should feel
comfortable forking Apache licensed code. I just want to make sure there is
no confusion about it - that is why we have the naming rules. If you end up
contributing code back, that is great, but it's a separate thing.




 Regards,

 Nagendra Nagarajayya
 http://solr-ra.tgels.org
 http://rankingalgorithm.tgels.org



 On 7/25/2012 6:54 AM, Mark Miller wrote:

 You are changing the name, or someone at Apache told you the current name
 is okay?

 If someone at Apache told you it was okay, who was that?

 You are certainly not using the Solr mark in an approved manner and I'd
 hope if you are going to take advantage of our mailing list for promotion
 of your product, that you would not violate our trademark. You are already
 on shaky ground promoting a Solr fork on the Solr mailing list by
 announcing every release - naming your fork something with Solr in it puts
 you over the edge on my list.

 We don't allow people to name their products things like Solr: the
 wonder edition or anything along those lines. Solr is our trademark and
 third party products must have their own name. The only thing we allow is
 the phrase powered by Solr.

 I'm on the Lucene/Solr PMC and am an Apache member and I'd find it pretty
 hard to believe that anyone would suggest that your usage is a correct
 usage of the Solr trademark.

 - Mark

 On Jul 24, 2012, at 8:36 AM, Nagendra Nagarajayya wrote:

  Thanks Mark! I am already working with Apache Software Foundation on the
 mark and am using the correct usage of the mark as suggested by them.

 Regards,

 Nagendra Nagarajayya
 http://solr-ra.tgels.org
 http://rankingalgorithm.tgels.org


 On 7/23/2012 12:15 PM, Mark Miller wrote:

 On Jul 23, 2012, at 11:27 AM, Nagendra Nagarajayya wrote:

  I am not sure why anyone would be offended by an announcement that
 NRT functionality was available with older releases.

 FWIW, I'm not offended - I don't mind if third parties post
 announcements if they are related to Solr.

 I just want to make sure it's very clear that it's a third party
 announce so there is no confusion - people that don't follow the lists on a
 daily basis read these things. A lot of these emails end up archived on
 various sites that collect mailing lists. It's easy to run into them
 without the proper context.

 I think part of the confusion is the naming. Technically, Apache does
 not allow the use of Apache marks as part of a third party name. Instead,
 the name should be something like Product X, powered by Solr

 See http://www.apache.org/foundation/marks/faq/#products

 - Mark Miller
 lucidimagination.com


  - Mark Miller
 lucidimagination.com

-- 
- Mark

http://www.lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-24 Thread Nagendra Nagarajayya

Hi Yonik:

Please see my comments below:

On 7/23/2012 8:52 AM, Yonik Seeley wrote:

On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya
nnagaraja...@transaxtions.com  wrote:

Realtime NRT algorithm enables NRT functionality in
Solr by not closing the Searcher object  and so is very fast. I am in the
process of contributing the algorithm back to Apache Solr as a patch.

Since you're in the process of contributing this back, perhaps you
could explain your approach - it never made sense to me.

Replacing the reader in an existing SolrIndexSearcher as you do means
that all the related caches will be invalid (meaning you can't use
solr's caches).  You could just ensure that there is no auto-warming
set up for Solr's caches (which is now the default), or you could
disable caching altogether.  It's not clear what you're comparing
against when you claim it's faster.


Solr with RankingAlgorithm does not replace the reader in the 
SolrIndexSearcher object. All it does is override the 
IndexSearcher.getIndexReader() method so as to supply an NRTReader if 
realtime is enabled. All direct references to the reader member have 
been replaced with getIndexReader() method calls.
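
The accessor override described above can be sketched roughly as follows. This
is only an illustration of the pattern, assuming hypothetical stand-in classes
(ToyReader, ToySearcher, RealtimeToySearcher); it is not the actual patch and
does not use the real Lucene or Solr classes.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for an index reader: an immutable snapshot tagged with a version.
final class ToyReader {
    final long version;
    ToyReader(long version) { this.version = version; }
}

// Hypothetical stand-in for a searcher that owns one reader for its lifetime.
class ToySearcher {
    private final ToyReader reader;
    ToySearcher(ToyReader reader) { this.reader = reader; }

    // All internal code goes through this accessor instead of touching the field.
    ToyReader getIndexReader() { return reader; }

    long visibleVersion() { return getIndexReader().version; }
}

// Subclass that hands out a fresher reader when "realtime" is switched on.
class RealtimeToySearcher extends ToySearcher {
    private final boolean realtime;
    private final Supplier<ToyReader> nrtReaders; // assumed source of up-to-date readers

    RealtimeToySearcher(ToyReader initial, boolean realtime, Supplier<ToyReader> nrtReaders) {
        super(initial);
        this.realtime = realtime;
        this.nrtReaders = nrtReaders;
    }

    @Override
    ToyReader getIndexReader() {
        return realtime ? nrtReaders.get() : super.getIndexReader();
    }
}
```

Because every internal access goes through getIndexReader(), the subclass can
change what callers see without the base class field ever being reassigned.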


The performance is better because SolrIndexSearcher is not closed every 1 sec 
as in soft-commit. SolrIndexSearcher is a heavy object with caches, etc. 
and is reference counted. So every 1 sec this object needs to be closed and 
re-allocated, the indexes need to be re-opened, and the caches invalidated, 
while waiting for existing searchers to complete, making this very 
expensive. Realtime NRT does not close the SolrIndexSearcher object but 
makes available a new NRTReader with the document updates, i.e. 
getIndexReader() returns a new NRTReader.
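
The cost argument rests on the searcher being reference counted: a replaced
searcher can only release its resources once in-flight requests are done with
it. A minimal sketch of that lifecycle, assuming a hypothetical
RefCountedSearcher class rather than Solr's own implementation, might look like
this:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reference-counted resource; starts with one reference held by its owner.
final class RefCountedSearcher {
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean closed = false;

    // A request takes a reference before using the searcher.
    // (Simplification: this sketch does not guard against incref after close.)
    void incref() {
        refs.incrementAndGet();
    }

    // Releasing the last reference is what actually closes the searcher.
    void decref() {
        if (refs.decrementAndGet() == 0) {
            closed = true; // real code would release readers and caches here
        }
    }

    boolean isClosed() { return closed; }
}
```

When the owner swaps in a new searcher it drops its own reference, but the old
searcher stays open until the last request using it calls decref().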



There are also consistency and concurrency issues with replacing the
reader in an existing SolrIndexSearcher, which is supposed to have a
static view of the index.  If a reader replacement happens in the
middle of a request, it's bound to cause trouble, including returning
the wrong documents!


The reader member is not replaced in the existing SolrIndexSearcher 
object. The IndexSearcher.getIndexReader() method has been overridden in 
SolrIndexSearcher and all direct reader member access has been replaced 
with a getIndexReader() method call, allowing an NRT reader to be supplied 
when realtime is enabled. The concurrency is handled by the 
getNRTReader() method, with the static index view now increased to the 
granularity provided by the NRTIndexReader.



Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org


-Yonik
http://lucidimagination.com







Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-24 Thread Yonik Seeley
On Tue, Jul 24, 2012 at 8:24 AM, Nagendra Nagarajayya
nnagaraja...@transaxtions.com wrote:
 SolrIndexSearcher is a heavy object with caches, etc.

As I've said, the caches are configurable, and it's trivial to disable
all caching (to the point where the cache objects are not even
created).

 The reader member is not replaced in the existing SolrIndexSearcher object.
 The IndexSearcher.getIndexReader() method has been overridden in
 SolrIndexSearcher and all direct reader member access has been replaced with
 a getIndexReader() method call, allowing an NRT reader to be supplied when
 realtime is enabled.

In a single Solr request (that runs through multiple components like
query, highlight, facet, and response writing),
does IndexSearcher.getIndexReader() always return the same reader?  If
not, this breaks pretty much every standard solr component - but it
will only be apparent under load, and if you are carefully sanity
checking the results.
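
The concern can be made concrete with a small sketch. It assumes hypothetical
toy types (Snapshot, LiveIndex, RequestDemo); none of them are Solr classes. A
request that asks for the reader once per component can see two different index
states, while a request that pins one reader up front cannot.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable point-in-time view of the index.
final class Snapshot {
    private final List<String> docs;
    Snapshot(List<String> docs) {
        this.docs = Collections.unmodifiableList(new ArrayList<>(docs));
    }
    int numDocs() { return docs.size(); }
}

// Hypothetical index whose current() always returns the newest snapshot.
final class LiveIndex {
    private final List<String> all = new ArrayList<>();
    private volatile Snapshot current = new Snapshot(all);

    synchronized void add(String doc) {
        all.add(doc);
        current = new Snapshot(all); // publish a new point-in-time view
    }
    Snapshot current() { return current; }
}

final class RequestDemo {
    // Risky pattern: each component fetches its own reader.
    // 'betweenComponents' simulates an update arriving mid-request.
    static int[] unpinned(LiveIndex index, Runnable betweenComponents) {
        int countSeenByQuery = index.current().numDocs();
        betweenComponents.run();
        int countSeenByFacet = index.current().numDocs();
        return new int[] { countSeenByQuery, countSeenByFacet };
    }

    // Safer pattern: fetch the reader once and hand it to every component.
    static int[] pinned(LiveIndex index, Runnable betweenComponents) {
        Snapshot reader = index.current();
        int countSeenByQuery = reader.numDocs();
        betweenComponents.run();
        int countSeenByFacet = reader.numDocs();
        return new int[] { countSeenByQuery, countSeenByFacet };
    }
}
```

In the unpinned variant the query and facet steps disagree about how many
documents exist whenever an update lands between them, which is the kind of
inconsistency that only shows up under load.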

-Yonik
http://lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-24 Thread Nagendra Nagarajayya
Thanks Mark! I am already working with Apache Software Foundation on the 
mark and am using the correct usage of the mark as suggested by them.


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org


On 7/23/2012 12:15 PM, Mark Miller wrote:

On Jul 23, 2012, at 11:27 AM, Nagendra Nagarajayya wrote:


I am not sure why anyone would be offended by an announcement that NRT 
functionality was available with older releases.

FWIW, I'm not offended - I don't mind if third parties post announcements if 
they are related to Solr.

I just want to make sure it's very clear that it's a third party announce so 
there is no confusion - people that don't follow the lists on a daily basis 
read these things. A lot of these emails end up archived on various sites that 
collect mailing lists. It's easy to run into them without the proper context.

I think part of the confusion is the naming. Technically, Apache does not allow the use 
of Apache marks as part of a third party name. Instead, the name should be something like 
Product X, powered by Solr

See http://www.apache.org/foundation/marks/faq/#products

- Mark Miller
lucidimagination.com

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-24 Thread Andy
Nagendra,

Does RankingAlgorithm work with faceting which requires the use of cache? As 
new documents are added or updated, the cache will be constantly invalidated. 
So how would RankingAlgorithm work in this case?



 From: Nagendra Nagarajayya nnagaraja...@transaxtions.com
To: solr-user@lucene.apache.org 
Sent: Tuesday, July 24, 2012 8:24 AM
Subject: Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with 
Realtime NRT available for download
 
Hi Yonik:

Please see my comments below:

On 7/23/2012 8:52 AM, Yonik Seeley wrote:
 On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya
 nnagaraja...@transaxtions.com  wrote:
 Realtime NRT algorithm enables NRT functionality in
 Solr by not closing the Searcher object  and so is very fast. I am in the
 process of contributing the algorithm back to Apache Solr as a patch.
 Since you're in the process of contributing this back, perhaps you
 could explain your approach - it never made sense to me.
 
 Replacing the reader in an existing SolrIndexSearcher as you do means
 that all the related caches will be invalid (meaning you can't use
 solr's caches).  You could just ensure that there is no auto-warming
 set up for Solr's caches (which is now the default), or you could
 disable caching altogether.  It's not clear what you're comparing
 against when you claim it's faster.

Solr with RankingAlgorithm does not replace the reader in the SolrIndexSearcher 
object. All it does is override the IndexSearcher.getIndexReader() method so as 
to supply an NRTReader if realtime is enabled. All direct references to the 
reader member have been replaced with getIndexReader() method calls.

The performance is better because SolrIndexSearcher is not closed every 1 sec as in 
soft-commit. SolrIndexSearcher is a heavy object with caches, etc. and is 
reference counted. So every 1 sec this object needs to be closed and re-allocated, 
the indexes need to be re-opened, and the caches invalidated, while waiting for 
existing searchers to complete, making this very expensive. Realtime NRT does 
not close the SolrIndexSearcher object but makes available a new NRTReader with 
the document updates, i.e. getIndexReader() returns a new NRTReader.

 There are also consistency and concurrency issues with replacing the
 reader in an existing SolrIndexSearcher, which is supposed to have a
 static view of the index.  If a reader replacement happens in the
 middle of a request, it's bound to cause trouble, including returning
 the wrong documents!

The reader member is not replaced in the existing SolrIndexSearcher object. The 
IndexSearcher.getIndexReader() method has been overridden in SolrIndexSearcher 
and all direct reader member access has been replaced with a getIndexReader() 
method call, allowing an NRT reader to be supplied when realtime is enabled. The 
concurrency is handled by the getNRTReader() method, with the static index view 
now increased to the granularity provided by the NRTIndexReader.


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org

 -Yonik
 http://lucidimagination.com
 
 

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-23 Thread Mark Miller
These emails from Nagendra are very confusing. I've asked him in the past to be 
explicit about his announce and make it clear that it is an external project.

Since I don't think he has changed how he does announce since that request, 
allow me to help out:

Please note: This project has nothing to do with Apache. It is a completely 
external project that apparently uses Apache Solr.

It's not supported by or endorsed by Apache or the Lucene/Solr projects. It's 
simply a project that an external user is promoting on the Solr mailing list.

- Mark Miller
lucidimagination.com

On Jul 22, 2012, at 5:07 PM, Nagendra Nagarajayya wrote:

 Hi!
 
 I am very excited to announce the availability of Solr 4.0-ALPHA with 
 RankingAlgorithm 1.4.4 with Realtime NRT. The Realtime NRT implementation now 
 supports both RankingAlgorithm and Lucene. Realtime NRT is a high-performance 
 and more granular NRT implementation compared to soft commit. The update 
 performance is about 70,000 documents / sec*. You can also scale up to 2 
 billion documents* in a single core, and query half a billion documents index 
 in ms**.
 
 RankingAlgorithm 1.4.4 supports the entire Lucene Query Syntax, ± and/or 
 boolean queries and is compatible with the new Lucene 4.0-ALPHA api.
 
 You can get more information about Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 
 Realtime performance from here:
 http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_4.x
 
 You can download Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 from here:
 http://solr-ra.tgels.org
 
 Please download and give the new version a try.
 
 Regards,
 
 Nagendra Nagarajayya
 http://solr-ra.tgels.org
 http://rankingalgorithm.tgels.org
 
 * performance seen at a user installation of Solr 4.0 with RankingAlgorithm 
 1.4.3
 ** performance seen when using the age feature
 

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-23 Thread Bernd Fehling
+1

What would happen if ALL external projects using Lucene and/or Solr announced 
on this list that they had moved up to the next release after every release change?

Also, about "Realtime NRT": if NRT stands for Near_Real_Time, then he has a Realtime 
Near_Real_Time algorithm.

Regards,
Bernd


On 23.07.2012 14:09, Mark Miller wrote:
 These emails from Nagendra are very confusing. I've asked him in the past to 
 be explicit about his announce and make it clear that it is an external 
 project.
 
 Since I don't think he has changed how he does announce since that request, 
 allow me to help out:
 
 Please note: This project has nothing to do with Apache. It is a completely 
 external project that apparently uses Apache Solr.
 
 It's not supported by or endorsed by Apache or the Lucene/Solr projects. It's 
 simply a project that an external user is promoting on the Solr mailing list.
 
 - Mark Miller
 lucidimagination.com
 
 On Jul 22, 2012, at 5:07 PM, Nagendra Nagarajayya wrote:
 
 Hi!

 I am very excited to announce the availability of Solr 4.0-ALPHA with 
 RankingAlgorithm 1.4.4 with Realtime NRT. The Realtime NRT implementation 
 now supports both RankingAlgorithm and Lucene. Realtime NRT is a 
 high-performance and more granular NRT implementation compared to soft commit. The 
 update performance is about 70,000 documents / sec*. You can also scale up 
 to 2 billion documents* in a single core, and query half a billion documents 
 index in ms**.

 RankingAlgorithm 1.4.4 supports the entire Lucene Query Syntax, ± and/or 
 boolean queries and is compatible with the new Lucene 4.0-ALPHA api.

 You can get more information about Solr 4.0-ALPHA with RankingAlgorithm 
 1.4.4 Realtime performance from here:
 http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_4.x

 You can download Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 from here:
 http://solr-ra.tgels.org

 Please download and give the new version a try.

 Regards,

 Nagendra Nagarajayya
 http://solr-ra.tgels.org
 http://rankingalgorithm.tgels.org

 * performance seen at a user installation of Solr 4.0 with RankingAlgorithm 
 1.4.3
 ** performance seen when using the age feature




Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-23 Thread Stefan Moises
+1, too... very confusing announcements, both because of the 
official-sounding posts and also the double-realtime name :P
And he also says in a follow-up post "I am contributing Realtime NRT to 
Solr 4.0...", which sounds like this feature will be available in the 
official 4.x Solr release, which makes it even more confusing.


The project itself sounds cool, though.

Cheers,
Stefan
On 23.07.2012 16:01, Bernd Fehling wrote:

+1

What would happen if ALL external projects using Lucene and/or Solr announced 
on this list that they had moved up to the next release after every release change?

Also, about "Realtime NRT": if NRT stands for Near_Real_Time, then he has a Realtime 
Near_Real_Time algorithm.

Regards,
Bernd


On 23.07.2012 14:09, Mark Miller wrote:

These emails from Nagendra are very confusing. I've asked him in the past to be 
explicit about his announce and make it clear that it is an external project.

Since I don't think he has changed how he does announce since that request, 
allow me to help out:

Please note: This project has nothing to do with Apache. It is a completely 
external project that apparently uses Apache Solr.

It's not supported by or endorsed by Apache or the Lucene/Solr projects. It's 
simply a project that an external user is promoting on the Solr mailing list.

- Mark Miller
lucidimagination.com

--
With best regards from Nürnberg,
Stefan Moises

***
Stefan Moises
Senior Software Developer
Head of Module Development

shoptimax GmbH
Guntherstraße 45 a
90461 Nürnberg
Amtsgericht Nürnberg HRB 21703
GF Friedrich Schreieck

Tel.: 0911/25566-0
Fax:  0911/25566-29
moi...@shoptimax.de
http://www.shoptimax.de
***




Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-23 Thread Nagendra Nagarajayya
I like Mark's suggestion of marking the announcement as an external 
project. Will add that to future announcements.


Regarding the announcement itself, Apache Solr with RankingAlgorithm has 
made NRT functionality available to Apache Solr since version 1.4.1. 
There were lots of requests/JIRAs for this functionality (some time back) 
which had not been addressed in Solr. So announcing on this list, to let 
everyone in the community know that this functionality is available with 
Apache Solr, is the right way to do it, right? The whole list is made up 
of developers who are using Apache Solr and who are interested in 
hearing about Apache Solr related stuff. I am not sure why anyone will 
get offended by an announcement that NRT functionality was available 
with older releases. Apache Solr 4.0 does support NRT functionality now 
with soft commit, but Realtime NRT is another way of providing the 
realtime functionality (much faster than soft commit). The breadth of 
the Apache Software Foundation is for innovation to come in not only from 
organized groups such as Apache Solr or Apache Lucene but also from 
individuals, small businesses, or even large, well-funded businesses. The 
ASF license also promotes that innovation may not be masked and provides 
ways to bundle closed source with open source. Apache Solr with 
RankingAlgorithm is available for free to everyone. It will provide 
innovative ways to search that may not be available with regular Apache 
Solr. So I think it is fair to announce a new release on the Apache Solr 
mailing list.


This announcement was made as Apache Solr 4.0-ALPHA is a major milestone 
Solr release. This would be similar to Python support for Apache Solr or 
other announcements related to Apache Solr being announced on this list.



Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org



On 7/23/2012 5:09 AM, Mark Miller wrote:

These emails from Nagendra are very confusing. I've asked him in the past to be 
explicit about his announce and make it clear that it is an external project.

Since I don't think he has changed how he does announce since that request, 
allow me to help out:

Please note: This project has nothing to do with Apache. It is a completely 
external project that apparently uses Apache Solr.

It's not supported by or endorsed by Apache or the Lucene/Solr projects. It's 
simply a project that an external user is promoting on the Solr mailing list.

- Mark Miller
lucidimagination.com

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-23 Thread Nagendra Nagarajayya
Thanks Bernd! Apache Solr 4.0-ALPHA is a major Solr milestone release, 
so I think you will find lots of announcements related to it, like 
Python support, etc. The same goes for the Apache Solr with 
RankingAlgorithm release.


Realtime NRT is an innovative way to provide NRT functionality to Solr. 
Realtime is the name of the tag used in solrconfig.xml to turn on this 
functionality. I had not named the previous releases but decided to name 
it from this release on, so as to differentiate the NRT functionality from 
the one provided by soft commit. The Realtime NRT algorithm enables NRT 
functionality in Solr by not closing the Searcher object and so is very 
fast. I am in the process of contributing the algorithm back to Apache 
Solr as a patch.


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org



On 7/23/2012 7:01 AM, Bernd Fehling wrote:

+1

What would happen if ALL external projects using Lucene and/or Solr announced 
on this list that they have stepped up to the next higher release after a 
release change?

Also, about "Realtime NRT": if NRT stands for Near_Real_Time, he has a 
"Realtime Near_Real_Time" algorithm.

Regards,
Bernd


On 23.07.2012 14:09, Mark Miller wrote:

These emails from Nagendra are very confusing. I've asked him in the past to be 
explicit about his announce and make it clear that it is an external project.

Since I don't think he has changed how he does announce since that request, 
allow me to help out:

Please note: This project has nothing to do with Apache. It is a completely 
external project that apparently uses Apache Solr.

It's not supported by or endorsed by Apache or the Lucene/Solr projects. It's 
simply a project that an external user is promoting on the Solr mailing list.

- Mark Miller
lucidimagination.com

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-23 Thread Yonik Seeley
On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya
nnagaraja...@transaxtions.com wrote:
 Realtime NRT algorithm enables NRT functionality in
 Solr by not closing the Searcher object  and so is very fast. I am in the
 process of contributing the algorithm back to Apache Solr as a patch.

Since you're in the process of contributing this back, perhaps you
could explain your approach - it never made sense to me.

Replacing the reader in an existing SolrIndexSearcher as you do means
that all the related caches will be invalid (meaning you can't use
solr's caches).  You could just ensure that there is no auto-warming
set up for Solr's caches (which is now the default), or you could
disable caching altogether.  It's not clear what you're comparing
against when you claim it's faster.

There are also consistency and concurrency issues with replacing the
reader in an existing SolrIndexSearcher, which is supposed to have a
static view of the index.  If a reader replacement happens in the
middle of a request, it's bound to cause trouble, including returning
the wrong documents!
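
A minimal sketch of the consistency concern above, in plain Java with no 
Lucene or Solr dependency. SnapshotReader, SwappableSearcher, and the 
document values are hypothetical stand-ins, not real Solr classes. It 
assumes a two-phase request (collect a document id, then fetch the 
document) and shows how a reader swap between the two phases can make the 
same id resolve to a different document, and how capturing the reader once 
per request avoids that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Demo entry point. Every class in this file is a hypothetical stand-in.
class ReaderSwapDemo {
    // First snapshot: ids 0..2 map to apple, banana, cherry.
    static SwappableSearcher newSearcher() {
        return new SwappableSearcher(
                new SnapshotReader(Arrays.asList("apple", "banana", "cherry")));
    }

    // Simulated update: "apple" is deleted and merged away, "date" is added,
    // so the ids of the surviving documents shift.
    static void applyUpdate(SwappableSearcher searcher) {
        searcher.swap(new SnapshotReader(Arrays.asList("banana", "cherry", "date")));
    }

    // Unsafe: each phase asks the searcher for its current reader,
    // so the two phases may run against different snapshots.
    static String unpinnedRequest() {
        SwappableSearcher searcher = newSearcher();
        int hit = searcher.getIndexReader().numDocs() - 1; // phase 1: id 2 is "cherry"
        applyUpdate(searcher);                             // reader replaced mid-request
        return searcher.getIndexReader().doc(hit);         // phase 2: id 2 is now "date"
    }

    // Safer: capture the reader once and use that snapshot for the whole request.
    static String pinnedRequest() {
        SwappableSearcher searcher = newSearcher();
        SnapshotReader pinned = searcher.getIndexReader();
        int hit = pinned.numDocs() - 1;
        applyUpdate(searcher);
        return pinned.doc(hit);
    }

    public static void main(String[] args) {
        System.out.println("unpinned: " + unpinnedRequest()); // unpinned: date
        System.out.println("pinned:   " + pinnedRequest());   // pinned:   cherry
    }
}

// Immutable point-in-time view of the index; ids are only valid within one snapshot.
final class SnapshotReader {
    private final List<String> docs;

    SnapshotReader(List<String> docs) {
        this.docs = new ArrayList<>(docs);
    }

    int numDocs() {
        return docs.size();
    }

    String doc(int id) {
        return docs.get(id);
    }
}

// Searcher whose reader can be replaced while requests are in flight.
final class SwappableSearcher {
    private final AtomicReference<SnapshotReader> current;

    SwappableSearcher(SnapshotReader initial) {
        this.current = new AtomicReference<>(initial);
    }

    SnapshotReader getIndexReader() {
        return current.get();
    }

    void swap(SnapshotReader next) {
        current.set(next);
    }
}
```

The swap itself is atomic in this sketch; the problem is only that a 
request which looks up the reader more than once is no longer working 
against a static view.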

-Yonik
http://lucidimagination.com


Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-23 Thread Mark Miller

On Jul 23, 2012, at 11:27 AM, Nagendra Nagarajayya wrote:

 I am not sure why any one will get offended by an announcement that NRT 
 functionality was available with older releases.

FWIW, I'm not offended - I don't mind if third parties post announcements if 
they are related to Solr.

I just want to make sure it's very clear that it's a third party announce so 
there is no confusion - people that don't follow the lists on a daily basis 
read these things. A lot of these emails end up archived on various sites that 
collect mailing lists. It's easy to run into them without the proper context.

I think part of the confusion is the naming. Technically, Apache does not allow 
the use of Apache marks as part of a third party name. Instead, the name should 
be something like "Product X, powered by Solr".

See http://www.apache.org/foundation/marks/faq/#products

- Mark Miller
lucidimagination.com













[Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-22 Thread Nagendra Nagarajayya

Hi!

I am very excited to announce the availability of Solr 4.0-ALPHA with 
RankingAlgorithm 1.4.4 with Realtime NRT. The Realtime NRT 
implementation now supports both RankingAlgorithm and Lucene. Realtime 
NRT is a high-performance and more granular NRT implementation compared 
to soft commit. The update performance is about 70,000 documents / sec*. 
You can also scale up to 2 billion documents* in a single core, and 
query a half-billion-document index in ms**.


RankingAlgorithm 1.4.4 supports the entire Lucene Query Syntax, ± and/or 
boolean queries and is compatible with the new Lucene 4.0-ALPHA API.


You can get more information about Solr 4.0-ALPHA with RankingAlgorithm 
1.4.4 Realtime performance from here:

http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_4.x

You can download Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 from here:
http://solr-ra.tgels.org

Please download and give the new version a try.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org

* performance seen at a user installation of Solr 4.0 with 
RankingAlgorithm 1.4.3

** performance seen when using the age feature



Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-22 Thread Darren Govoni
What exactly is Realtime NRT (Near Real Time)?

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-22 Thread Nagendra Nagarajayya
Realtime NRT is an NRT implementation available for Solr 1.4.1 to Solr 
4.0. To enable NRT, it makes an NRTIndexReader available to the 
IndexSearcher for searching the index. It does this without closing the 
SolrIndexSearcher, which is a very heavy object with caches, etc. Since 
the Searcher is never closed, it always uses the most recent 
NRTIndexReader for searching, and you get a pipe that is always filled 
with newly updated documents. The code changes are to handle this dynamic 
pipe, which may always have something new, as in a realtime system.
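
The shape of this idea can be sketched without any Lucene or Solr 
dependency. Everything in the block below (IndexView, BaseSearcher, 
RealtimeSearcher, and the Supplier standing in for whatever hands out the 
newest NRT reader) is a hypothetical stand-in, not the actual patch and 
not a real Solr API. It only illustrates a long-lived searcher whose 
getIndexReader() is overridden to return the latest reader when realtime 
is enabled.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Demo entry point. Every class in this file is a hypothetical stand-in.
class RealtimeSearcherDemo {
    // Returns {before, after}: the reader version the SAME searcher object
    // reports before and after an update produces a newer reader.
    static long[] versionsSeen(boolean realtime) {
        AtomicReference<IndexView> latest = new AtomicReference<>(new IndexView(1));
        BaseSearcher searcher = new RealtimeSearcher(latest.get(), latest::get, realtime);

        long before = searcher.getIndexReader().version;
        latest.set(new IndexView(2)); // an update makes a newer reader available
        long after = searcher.getIndexReader().version;

        return new long[] { before, after };
    }

    public static void main(String[] args) {
        long[] rt = versionsSeen(true);
        long[] fixed = versionsSeen(false);
        System.out.println("realtime on:  " + rt[0] + " -> " + rt[1]);
        System.out.println("realtime off: " + fixed[0] + " -> " + fixed[1]);
    }
}

// Stand-in for an index reader; only carries a version number.
final class IndexView {
    final long version;

    IndexView(long version) {
        this.version = version;
    }
}

// Stand-in for a searcher that is bound to one reader for its whole lifetime.
class BaseSearcher {
    private final IndexView reader;

    BaseSearcher(IndexView reader) {
        this.reader = reader;
    }

    IndexView getIndexReader() {
        return reader;
    }
}

// Long-lived searcher: with realtime enabled, every call returns the newest reader.
class RealtimeSearcher extends BaseSearcher {
    private final Supplier<IndexView> latest;
    private final boolean realtime;

    RealtimeSearcher(IndexView initial, Supplier<IndexView> latest, boolean realtime) {
        super(initial);
        this.latest = latest;
        this.realtime = realtime;
    }

    @Override
    IndexView getIndexReader() {
        return realtime ? latest.get() : super.getIndexReader();
    }
}
```

With realtime on, the searcher object is never replaced but the view it 
searches changes underneath it; with realtime off, it keeps the reader it 
was constructed with.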


Realtime NRT is different from soft commit in that it does not close the 
SolrIndexSearcher object every 1000 secs, invalidating the caches, etc. 
SolrIndexSearcher is a very heavy, reference-counted object with caches, 
etc. Closing it every time may turn out to be expensive.
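
To make the cache-invalidation cost concrete, here is a small sketch in 
plain Java. GenerationCache, the "facet:category" key, and the generation 
counter are hypothetical and do not correspond to Solr's cache classes. 
The assumption is simply that a cached value is only valid for the reader 
generation it was computed against, so every new generation forces a 
recomputation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Demo entry point. Every class in this file is a hypothetical stand-in.
class CacheGenerationDemo {
    // Runs the same lookup 'lookups' times and bumps the reader generation
    // every 'updateEvery' lookups (0 means the index never changes).
    // Returns how many times the value had to be recomputed.
    static int recomputations(int lookups, int updateEvery) {
        GenerationCache<String, Integer> cache = new GenerationCache<>();
        int[] computed = { 0 };
        long generation = 1;
        for (int i = 0; i < lookups; i++) {
            if (updateEvery > 0 && i > 0 && i % updateEvery == 0) {
                generation++; // an update produced a new reader
            }
            cache.get("facet:category", generation, () -> {
                computed[0]++; // stands in for an expensive facet computation
                return 42;
            });
        }
        return computed[0];
    }

    public static void main(String[] args) {
        System.out.println("no updates:            " + recomputations(10, 0));
        System.out.println("update every 5 lookups: " + recomputations(10, 5));
        System.out.println("update every lookup:    " + recomputations(10, 1));
    }
}

// Cache whose entries are only valid for the reader generation that produced them.
final class GenerationCache<K, V> {
    private static final class Entry<V> {
        final long generation;
        final V value;

        Entry(long generation, V value) {
            this.generation = generation;
            this.value = value;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();

    V get(K key, long generation, Supplier<V> compute) {
        Entry<V> entry = entries.get(key);
        if (entry != null && entry.generation == generation) {
            return entry.value; // hit: computed against the same reader generation
        }
        V value = compute.get(); // miss or stale: recompute against the new reader
        entries.put(key, new Entry<>(generation, value));
        return value;
    }
}
```

The more often the reader changes, the closer the cache gets to doing no 
useful work, regardless of whether the searcher object itself is closed.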


I am contributing Realtime NRT to Solr 4.0 and am working on making a 
patch available.


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org


On 7/22/2012 2:03 PM, Darren Govoni wrote:

What exactly is Realtime NRT (Near Real Time)?
