[jira] Resolved: (SOLR-964) XPathEntityProcessor must ignore DOCTYPE definitions

2009-01-22 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-964.


Resolution: Fixed

Committed revision 736616.

Thanks Fergus and Noble!

 XPathEntityProcessor must ignore DOCTYPE definitions
 

 Key: SOLR-964
 URL: https://issues.apache.org/jira/browse/SOLR-964
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Affects Versions: 1.3
Reporter: Noble Paul
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-964.patch


 DTD validation is expensive, and it sometimes causes the data import to fail. 
 As DIH does not need to validate against any DTD, it should ignore it.
 A user reported an issue:
 http://markmail.org/message/izchgxdmulldpwpn
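
One way to read XML without processing its DOCTYPE is sketched below, using the standard StAX properties. This is an illustration only, not necessarily what SOLR-964.patch does; the class DtdIgnoringReaderSketch and its method are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of DIH: returns the text of the first <name>
// element while leaving any DOCTYPE declaration unprocessed.
class DtdIgnoringReaderSketch {
  static String firstName(String xml) {
    XMLInputFactory factory = XMLInputFactory.newInstance();
    // Standard StAX properties: do not process DTDs or resolve external entities.
    factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
    factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
    try {
      XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
      try {
        while (reader.hasNext()) {
          if (reader.next() == XMLStreamConstants.START_ELEMENT
              && "name".equals(reader.getLocalName())) {
            return reader.getElementText();
          }
        }
        return null; // no <name> element found
      } finally {
        reader.close();
      }
    } catch (XMLStreamException e) {
      // Rethrown unchecked to keep the sketch easy to call.
      throw new IllegalStateException(e);
    }
  }
}
```

With this setup a document that declares a DTD which cannot be fetched can still be read, as long as it does not rely on entities defined in that DTD.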

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Closed: (SOLR-930) SolrCore.close() : Warn in the logger when the internal reference count is 0

2009-01-22 Thread Kay Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kay Kay closed SOLR-930.



| Kay: @see tags can't contain {@link} tags (at least not in Javadoc 1.5) 

Oops - I was using java 6 (javadoc compiler). Thanks for the clarification in 
the docs. 

 SolrCore.close() : Warn in the logger when the internal reference count is  0
 --

 Key: SOLR-930
 URL: https://issues.apache.org/jira/browse/SOLR-930
 Project: Solr
  Issue Type: Improvement
 Environment: Java 6, Tomcat 6 
Reporter: Kay Kay
Assignee: Ryan McKinley
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-930.patch, SOLR-930.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 SolrCore.close() - Add a warning statement when the internal reference count 
 is  0. ( as opposed to 0, as expected ) - 




[jira] Created: (SOLR-976) deleteByQuery is ignored when deleteById is placed prior to deleteByQuery in a delete

2009-01-22 Thread Koji Sekiguchi (JIRA)
deleteByQuery is ignored when deleteById is placed prior to deleteByQuery in a 
delete 


 Key: SOLR-976
 URL: https://issues.apache.org/jira/browse/SOLR-976
 Project: Solr
  Issue Type: Bug
Reporter: Koji Sekiguchi
Priority: Trivial
 Fix For: 1.4


Due to the following if block, deleteByQuery cannot be executed. cmd.id and 
cmd.query should be set to null when UpdateProcessor chain is finished.

{code:title=RunUpdateProcessor}
public void processDelete(DeleteUpdateCommand cmd) throws IOException {
  if( cmd.id != null ) {
    updateHandler.delete(cmd);
  }
  else {
    updateHandler.deleteByQuery(cmd);
  }
  super.processDelete(cmd);
}
{code}
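
The proposal in this report (reset cmd.id and cmd.query once the command has been handled) can be sketched with stand-in classes. Nothing below is real Solr code: DeleteCommandSketch, RecordingHandlerSketch and DeleteDispatchSketch are hypothetical names, and the sketch assumes the same command object is reused for each child of a delete.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for DeleteUpdateCommand; only the two fields named in the report are modelled.
class DeleteCommandSketch {
  String id;
  String query;
}

// Stand-in for the update handler: it only records what it was asked to do.
class RecordingHandlerSketch {
  final List<String> calls = new ArrayList<String>();
  void delete(DeleteCommandSketch cmd) { calls.add("id:" + cmd.id); }
  void deleteByQuery(DeleteCommandSketch cmd) { calls.add("query:" + cmd.query); }
}

class DeleteDispatchSketch {
  // Same if/else as in the report, followed by the proposed reset.
  static void processDelete(DeleteCommandSketch cmd, RecordingHandlerSketch handler) {
    if (cmd.id != null) {
      handler.delete(cmd);
    } else {
      handler.deleteByQuery(cmd);
    }
    // Clear both fields so a stale id cannot hide a later query on the reused command.
    cmd.id = null;
    cmd.query = null;
  }
}
```

Without the two reset lines, a command that first carried an id would keep taking the delete-by-id branch even after a query had been set on it.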





[jira] Updated: (SOLR-976) deleteByQuery is ignored when deleteById is placed prior to deleteByQuery in a delete

2009-01-22 Thread Koji Sekiguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated SOLR-976:


Attachment: SOLR-976.patch

 deleteByQuery is ignored when deleteById is placed prior to deleteByQuery in 
 a delete 
 

 Key: SOLR-976
 URL: https://issues.apache.org/jira/browse/SOLR-976
 Project: Solr
  Issue Type: Bug
Reporter: Koji Sekiguchi
Priority: Trivial
 Fix For: 1.4

 Attachments: SOLR-976.patch


 Due to the following if block, deleteByQuery cannot be executed. cmd.id and 
 cmd.query should be set to null when UpdateProcessor chain is finished.
 {code:title=RunUpdateProcessor}
 public void processDelete(DeleteUpdateCommand cmd) throws IOException {
   if( cmd.id != null ) {
     updateHandler.delete(cmd);
   }
   else {
     updateHandler.deleteByQuery(cmd);
   }
   super.processDelete(cmd);
 }
 {code}




[jira] Commented: (SOLR-972) EventListener-s creation changed from a per request ( full / delta-imports) scenario to once through the lifetime of the DIH plugin.

2009-01-22 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666193#action_12666193
 ] 

Kay Kay commented on SOLR-972:
--

I agree with the comment regarding a custom scope for a Context. Those 
Context-s that need to be reused could still be kept in scope before fetching, 
to avoid recreating them, but I am more concerned about recreating 
EventListener-s. 

| It is still possible for you to share Objects in a static variable in your 
EventListener.
| The design is modeled like the servlet API. This is akin to storing and 
retrieving data from the servletContext, session, request, etc. 

Sharing via static variables does not seem to be the cleanest design. What if 
there are 2 EventListener-s, one for start and another for end, inheriting from 
a common class that has shared attributes? Sharing via static variables (in the 
base class) brings unpredictable behavior and code that is difficult to 
maintain.

Imagine a request handler in a Servlet having no state and being instantiated 
again for every request. Recreating EventListener-s is analogous. It makes 
sharing any state between successive calls harder (it already does, and I am 
working on a patched version of Solr). 

 EventListener-s creation changed from a per request ( full / delta-imports) 
 scenario to once through the lifetime of the DIH plugin.
 

 Key: SOLR-972
 URL: https://issues.apache.org/jira/browse/SOLR-972
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
 Environment: Java 6, Tomcat 6
Reporter: Kay Kay
 Fix For: 1.4

 Attachments: SOLR-972.patch

   Original Estimate: 2h
  Remaining Estimate: 2h

 The EventListener plugin for notification of start / end import events 
 (SOLR-938) creates an instance of EventListener before every notification. 
 This has 2 drawbacks. 
 * No state is stored between successive invocations of events as it is a new 
 object 
 * When writing plugins for delta imports - it is very inefficient to do a 
 class loader lookup by reflection / instantiate an instance and call a method 
 on the same. 
 Attached patch has one EventListener through the lifetime of the DIH plugin. 
 Also EventListener is changed to an interface rather than an abstract class 
 for better decoupling (an abstract class is especially painful when the 
 start/end event listener has an independent hierarchy of its own). 
 By default, a no-op listener is registered to avoid boilerplate code to 
 check if there is a start / end listener specified.  Efficient JRE impls 
 should be able to optimize the no-op for minimum overhead compared to 
 checking the reference for null and branching out. 
 Specifying an onImportStart / onImportEnd overrides the default handler 
 though. 
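
A rough sketch of the idea described above: one listener instance kept for the lifetime of the importer, with a no-op default so callers never test for null. The interface and class names here are hypothetical and the real DIH signatures may differ.

```java
// Hypothetical listener interface; the real DIH type and method signature may differ.
interface ImportListenerSketch {
  void onEvent(String phase);
}

// Example listener whose state survives because the same instance is reused.
class CountingListenerSketch implements ImportListenerSketch {
  int events;
  public void onEvent(String phase) { events++; }
}

class ImporterSketch {
  // Shared no-op default, so runImport() needs no null checks.
  static final ImportListenerSketch NO_OP = new ImportListenerSketch() {
    public void onEvent(String phase) { }
  };

  private final ImportListenerSketch onStart;
  private final ImportListenerSketch onEnd;

  // Listeners are created once by the caller and kept for the importer's lifetime.
  ImporterSketch(ImportListenerSketch onStart, ImportListenerSketch onEnd) {
    this.onStart = onStart != null ? onStart : NO_OP;
    this.onEnd = onEnd != null ? onEnd : NO_OP;
  }

  void runImport() {
    onStart.onEvent("start");
    // ... the import itself would run here ...
    onEnd.onEvent("end");
  }
}
```

Because the listener is not recreated per import, a counter or cached resource held in it is still there on the next full or delta import.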




[jira] Commented: (SOLR-969) Context#FULL_DUMP, DELTA_DUMP, FIND_DELTA as an enum

2009-01-22 Thread Kay Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666203#action_12666203
 ] 

Kay Kay commented on SOLR-969:
--

I agree String-s are portable. But choosing an enum would still be compatible 
with Strings thanks to the valueOf(String) method that every enum type has - 
we can still regenerate the enum from the String representation (in case of 
non-Java languages). 

In any case - Context seems to be restricted to DIH and seems to have been 
available only since 1.3. So making the change now might be more intuitive to 
the developer than integers, which can be misused and were not meant to be 
passed as an argument in the first place. 
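
To illustrate the point about Strings, here is a small sketch of an enum that can be rebuilt from its String form. The enum name ImportPhaseSketch and the parse helper are hypothetical; only the three constant names come from the issue title.

```java
import java.util.Locale;

// Hypothetical enum; the constant names are taken from the issue title.
enum ImportPhaseSketch {
  FULL_DUMP, DELTA_DUMP, FIND_DELTA;

  // Rebuilds the constant from its String form, e.g. a value sent by a non-Java client.
  // Throws IllegalArgumentException for an unknown name.
  static ImportPhaseSketch parse(String name) {
    return ImportPhaseSketch.valueOf(name.trim().toUpperCase(Locale.ROOT));
  }
}
```

A caller can then switch on the enum instead of comparing integers, and name() gives back the portable String when one is needed.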

 Context#FULL_DUMP, DELTA_DUMP, FIND_DELTA as an enum 
 -

 Key: SOLR-969
 URL: https://issues.apache.org/jira/browse/SOLR-969
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
 Environment: Java 6, Tomcat 6
Reporter: Kay Kay
 Fix For: 1.4

 Attachments: SOLR-969.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 Context # FULL_DUMP,  DELTA_DUMP, FIND_DELTA made type-safe as an enum. 




[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-01-22 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666255#action_12666255
 ] 

Shalin Shekhar Mangar commented on SOLR-844:


Guys, are there any objections to committing this to trunk? If not, I'd like to 
go over the patch and commit it in a few days. I don't think this needs to go 
to a separate contrib since it is such a small piece of code and doesn't have 
any extra dependencies.

 A SolrServer impl to front-end multiple urls
 

 Key: SOLR-844
 URL: https://issues.apache.org/jira/browse/SOLR-844
 Project: Solr
  Issue Type: New Feature
  Components: clients - java
Affects Versions: 1.3
Reporter: Noble Paul
Assignee: Shalin Shekhar Mangar
 Fix For: 1.4

 Attachments: SOLR-844.patch, SOLR-844.patch, SOLR-844.patch


 Currently a {{CommonsHttpSolrServer}} can talk to only one server. This 
 demands that the user have a LoadBalancer or do the round-robin on their own. 
 We must have a {{LBHttpSolrServer}} which automatically does load balancing 
 between multiple hosts. This can be backed by the 
 {{CommonsHttpSolrServer}}.
 This can have the following other features:
 * Automatic failover
 * Optionally take in a file/url containing the urls of servers so that 
 the server list can be automatically updated by periodically loading the 
 config
 * Support for adding/removing servers during runtime
 * Pluggable Loadbalancing mechanism. (round-robin, weighted round-robin, 
 random etc)
 * Pluggable Failover mechanisms
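
For readers unfamiliar with the idea, round-robin with automatic failover could look roughly like the sketch below. It is not taken from SOLR-844.patch; BackendCallSketch and RoundRobinSketch are hypothetical names, and the "call" stands in for whatever sends a request to one URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical abstraction over "send this request to one URL".
interface BackendCallSketch {
  String call(String url) throws Exception;
}

class RoundRobinSketch {
  private final List<String> urls;
  private final AtomicInteger next = new AtomicInteger();

  RoundRobinSketch(List<String> urls) {
    if (urls.isEmpty()) throw new IllegalArgumentException("no servers");
    this.urls = new ArrayList<String>(urls);
  }

  // Tries each server at most once, starting from the next one in rotation.
  String request(BackendCallSketch backend) {
    int start = next.getAndIncrement();
    Exception last = null;
    for (int i = 0; i < urls.size(); i++) {
      // Mask the sign bit so the index stays valid after the counter overflows.
      int index = ((start + i) & Integer.MAX_VALUE) % urls.size();
      try {
        return backend.call(urls.get(index));
      } catch (Exception e) {
        last = e; // fail over to the next server in the list
      }
    }
    throw new IllegalStateException("all servers failed", last);
  }
}
```

The other items in the list (reloading the server list, pluggable strategies) are left out of this sketch on purpose.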




[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-01-22 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666296#action_12666296
 ] 

Otis Gospodnetic commented on SOLR-844:
---

I'm not sure there is a clear consensus about this functionality being a good 
thing.  Perhaps we can get more people's opinions?





[jira] Issue Comment Edited: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-01-22 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666296#action_12666296
 ] 

otis edited comment on SOLR-844 at 1/22/09 1:12 PM:


I'm not sure there is a clear consensus about this functionality being a good 
thing (also 0 votes).  Perhaps we can get more people's opinions?


  was (Author: otis):
I'm not sure there is a clear consensus about this functionality being a 
good thing.  Perhaps we can get more people's opinions?

  



Re: [jira] Issue Comment Edited: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-01-22 Thread Walter Underwood
This would be useful if there was search-specific balancing,
like always send the same query back to the same server. That
can make your cache far more effective.
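
A very small sketch of that kind of search-specific balancing, assuming the raw query string is a good enough key: hash the query and use the hash to pick the server. QueryAffinitySketch is a hypothetical name and nothing like it is claimed to be in the patch.

```java
import java.util.List;

class QueryAffinitySketch {
  // Picks a server from the query string alone, so equal queries map to the same server.
  static String pick(List<String> urls, String query) {
    if (urls.isEmpty()) throw new IllegalArgumentException("no servers");
    // Mask the sign bit because hashCode() may be negative.
    int index = (query.hashCode() & Integer.MAX_VALUE) % urls.size();
    return urls.get(index);
  }
}
```

Note that this simple modulo mapping changes for most queries whenever a server is added or removed, so the caches would go cold after such a change.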

wunder




Re: [jira] Created: (SOLR-308) Add a field that generates an unique id when you have none in your data to index

2009-01-22 Thread tyball

 I have a simple multi-core test configuration, each core with a uuid field
type as its unique id (using default=NEW). When I run a query that uses
shards, I always get an exception saying that the UUID is not valid. It has
something to do with the toString of UUIDField: the value retrieved suddenly
contains java.util.UUID plus the actual UUID. I patched it myself by removing
the java.util.UUID substring from the UUID, and everything is fine. Everything
also works when not using shards in the query, or when not using UUIDs, of
course.


-- 
View this message in context: 
http://www.nabble.com/-jira--Created%3A-%28SOLR-308%29-Add-a-field-that-generates-an-unique-id-when-you-have-none-in-your-data-to-index-tp11663589p21617406.html
Sent from the Solr - Dev mailing list archive at Nabble.com.



[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-01-22 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666409#action_12666409
 ] 

Shalin Shekhar Mangar commented on SOLR-844:


Walter Underwood on solr-dev:
bq. This would be useful if there was search-specific balancing,
like always send the same query back to the same server. That
can make your cache far more effective.

That is an interesting thought. Right now Solrj does not have the means to 
construct the Query/Filter objects which are used as the key for the Solr 
caches. Let us try to figure out if/how it can be implemented.

bq. I'm not sure there is a clear consensus about this functionality being a 
good thing (also 0 votes). Perhaps we can get more people's opinions?

Yes Otis. That is exactly what I wanted to do with my comment :)

I guess most people think that this is a solved problem and I agree. Solr has 
always required users to have load balancers and I guess our users have come to 
accept it. But if you look at the new distributed systems being developed 
(Katta/Bailey/CouchDB/Voldemort, etc.), almost none of them assume an external 
load balancer or fail over system in their design. If they do, external systems 
are made optional (Voldemort). Right now we force people to use and maintain an 
external system if they have a Solr master/slave architecture. Apache and 
mod_proxy_balancer are great but have a higher latency than hardware based load 
balancers and these are not cheap.

The current patch is very simple. However, better ways to handle configuration 
and load balancing can be added. I think this can be a good starting point for 
the Solr 2.0 proposals. Unfortunately, very few of us have time to implement 
all of the proposals in a big bang way. But if we focus on one issue at a time, 
we will get somewhere close sooner. This might not be the code which ends up in 
Solr 2.0 but I think it can provide a good transition path.

Thoughts?




[jira] Issue Comment Edited: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-01-22 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666409#action_12666409
 ] 

shalinmangar edited comment on SOLR-844 at 1/22/09 7:46 PM:
-

Walter Underwood on solr-dev:
bq. This would be useful if there was search-specific balancing, like always 
send the same query back to the same server. That can make your cache far more 
effective.

That is an interesting thought. Right now Solrj does not have the means to 
construct the Query/Filter objects which are used as the key for the Solr 
caches. Let us try to figure out if/how it can be implemented.

bq. I'm not sure there is a clear consensus about this functionality being a 
good thing (also 0 votes). Perhaps we can get more people's opinions?

Yes Otis. That is exactly what I wanted to do with my comment :)

I guess most people think that this is a solved problem and I agree. Solr has 
always required users to have load balancers and I guess our users have come to 
accept it. But if you look at the new distributed systems being developed 
(Katta/Bailey/CouchDB/Voldemort, etc.), almost none of them assume an external 
load balancer or fail over system in their design. If they do, external systems 
are made optional (Voldemort). Right now we force people to use and maintain an 
external system if they have a Solr master/slave architecture. Apache and 
mod_proxy_balancer are great but have a higher latency than hardware based load 
balancers and these are not cheap.

The current patch is very simple. However, better ways to handle configuration 
and load balancing can be added. I think this can be a good starting point for 
the Solr 2.0 proposals. Unfortunately, very few of us have time to implement 
all of the proposals in a big bang way. But if we focus on one issue at a time, 
we will get somewhere close sooner. This might not be the code which ends up in 
Solr 2.0 but I think it can provide a good transition path.

Thoughts?


[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-01-22 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666413#action_12666413
 ] 

Noble Paul commented on SOLR-844:
-

bq. I'm not sure there is a clear consensus about this functionality being a 
good thing

This is not a Solr functionality. This is an external feature. Solr is agnostic 
about it.

Most of our users use Solr with an external LoadBalancer. That does not mean 
it is always the best solution. They do it because that is the only way. We 
are trying to offer a choice. 


A very good comparison would be the memcached client. It does load balancing on 
the client side. I have copied some ideas from there too. I have already 
mentioned the example of mysql.

The problem with the existing solutions is that there is no automatic 
failover. This implementation has it. 









[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-01-22 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666416#action_12666416
 ] 

Mark Miller commented on SOLR-844:
--

I think you're missing Otis' point, Noble. He is not dismissing this patch on 
its technical or useful merits. He's pointing out that a couple of people have 
voiced skepticism and no one has voted for the issue. When that's the case, 
it's not normal to put the issue in without more discussion. Which is what is 
happening, but I don't think your arguments alone should get the code 
committed. Rather, after you have expressed your arguments, we wait for votes, 
or more input. A couple of people seem to like the idea as well, but that 
info has just started coming, so let's let it play out a little before 
committing.

My opinion:

My initial thought was negative as well, for the obvious reasons. However, it's 
such a simple thing (at least for this basic support), improves efficiency a 
bit, and could be a lot easier for a Solr user to set up than a load balancer 
they don't know. I think I am a +1 myself.

On the patch: I haven't spent a lot of time looking at it, but I think it's 
best practice to only access shared variables within the lock. For instance, 
you access isEmpty on a shared variable, then get the lock and access the 
shared variable. I realize you want to save the lock cost, but if you need the 
lock, you shouldn't do that.
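
The pattern being recommended, sketched with hypothetical names (DownServersSketch is not a class from the patch): the emptiness check and the access that depends on it both happen while the lock is held.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical holder for servers that are currently marked as down.
class DownServersSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final List<String> down = new ArrayList<String>();

  void add(String url) {
    lock.lock();
    try {
      down.add(url);
    } finally {
      lock.unlock();
    }
  }

  // The isEmpty() check and the removal run under the same lock, so no other
  // thread can change the list between the two steps.
  String pollOrNull() {
    lock.lock();
    try {
      return down.isEmpty() ? null : down.remove(0);
    } finally {
      lock.unlock();
    }
  }
}
```

Checking isEmpty() before taking the lock saves a lock acquisition in the common case, but the answer may be stale by the time the lock is held, so the check would have to be repeated inside anyway.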

- Mark




[jira] Commented: (SOLR-844) A SolrServer impl to front-end multiple urls

2009-01-22 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666417#action_12666417
 ] 

Noble Paul commented on SOLR-844:
-

Hi Mark, thanks

I am in no hurry to get this committed. I am just trying to solicit opinions 
from other committers/users. 

bq. For instance, you access isEmpty on a shared variable, then get the lock 
and access the shared variable

The Object in question is a threadsafe object (CopyOnWriteArrayList) so I 
believe it should not hurt. 






[jira] Created: (SOLR-977) version=2.2 is not necessary for wt=javabin

2009-01-22 Thread Noble Paul (JIRA)
version=2.2 is not necessary for wt=javabin
---

 Key: SOLR-977
 URL: https://issues.apache.org/jira/browse/SOLR-977
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Noble Paul
Priority: Trivial


CommonsHttpSolrServer can drop the version=2.2 parameter when wt=javabin
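
As a rough illustration of the proposed behaviour (not the attached patch), the parameter could be added only for non-javabin response writers. {{RequestParams.build}} is a hypothetical helper, and the parameters are modelled as a plain map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of CommonsHttpSolrServer. */
final class RequestParams {
    static Map<String, String> build(String wt) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("wt", wt);
        // Assumption for this sketch: only non-javabin writers need version=2.2.
        if (!"javabin".equals(wt)) {
            params.put("version", "2.2");
        }
        return params;
    }
}
```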




[jira] Updated: (SOLR-977) version=2.2 is not necessary for wt=javabin

2009-01-22 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-977:


Attachment: SOLR-977.patch

 version=2.2 is not necessary for wt=javabin
 ---

 Key: SOLR-977
 URL: https://issues.apache.org/jira/browse/SOLR-977
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Noble Paul
Priority: Trivial
 Attachments: SOLR-977.patch


 CommonsHttpSolrServer can drop the version=2.2 parameter when wt=javabin




[jira] Updated: (SOLR-921) SolrResourceLoader must cache name vs class

2009-01-22 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-921:


Fix Version/s: 1.4

 SolrResourceLoader must cache name vs class
 ---

 Key: SOLR-921
 URL: https://issues.apache.org/jira/browse/SOLR-921
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-921.patch


 Every class that is loaded through SolrResourceLoader does a Class.forName(), 
 and if the class is not found a ClassNotFoundException is thrown.
 Then it looks through the various packages and finds the right class if the 
 name starts with solr. Considering that we usually use this 
 solr.classname format, we pay too high a price for this. After every 
 lookup the result can be cached in a Map<String,Class> and shared 
 across all the cores; this Map can be stored at the CoreContainer level.
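
A minimal sketch of the caching idea, under stated assumptions: {{CachingClassFinder}} is a hypothetical stand-in for the loader, the shared map is passed in by whoever owns it (the description suggests the CoreContainer), and the package prefixes are supplied by the caller. It is not the code in SOLR-921.patch.

```java
import java.util.Map;

/** Hypothetical sketch; class and method names are not from SOLR-921.patch. */
final class CachingClassFinder {
    private final Map<String, Class<?>> cache; // shared across all finders ("cores")
    private final String[] packages;           // package prefixes tried for solr.* names
    int slowLookups = 0;                       // demo counter only, not thread-safe

    CachingClassFinder(Map<String, Class<?>> sharedCache, String... packages) {
        this.cache = sharedCache;
        this.packages = packages;
    }

    Class<?> findClass(String name) throws ClassNotFoundException {
        Class<?> cached = cache.get(name);
        if (cached != null) return cached; // fast path: no exception is thrown
        slowLookups++;
        Class<?> found = resolve(name);
        cache.put(name, found);
        return found;
    }

    private Class<?> resolve(String name) throws ClassNotFoundException {
        try {
            return Class.forName(name); // succeeds for fully qualified names
        } catch (ClassNotFoundException e) {
            if (!name.startsWith("solr.")) throw e;
        }
        String simpleName = name.substring("solr.".length());
        for (String pkg : packages) {
            try {
                return Class.forName(pkg + simpleName);
            } catch (ClassNotFoundException ignored) {
                // try the next package prefix
            }
        }
        throw new ClassNotFoundException(name);
    }
}
```

With a shared ConcurrentHashMap, a second finder resolving the same solr.-prefixed name returns the cached class without triggering any ClassNotFoundException.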




[jira] Issue Comment Edited: (SOLR-921) SolrResourceLoader must cache name vs class

2009-01-22 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658754#action_12658754
 ] 

noble.paul edited comment on SOLR-921 at 1/22/09 10:50 PM:
---

The patch currently caches the result only if the default set of packages is 
used. If you pass an extra list of package names, the result is not 
cached. The ideal solution is to include the package names in the key as well. 
I have ignored those use cases for simplicity. I have also ignored cases where 
classes are loaded by the parent classloader. Ideally the classloader must also be 
considered when making the key for the cache. 

This is useful when cores are loaded/unloaded very frequently and there are a 
large number of cores (tens of thousands). In other cases the perf benefits 
are negligible. 

Plugins are rarely loaded using the solr.cname form. If we use a 
fully qualified name, no ClassNotFoundException is thrown, so the 
cost is low and not worth optimizing. So I have ignored all such cases.

Caching the classes at the SolrResourceLoader instance level means one core 
cannot benefit from the 'learnings' of another core.
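
For reference, the "ideal" key mentioned above (name plus package list plus classloader) might be sketched as follows. This is not what the patch does, and {{ClassCacheKey}} is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Hypothetical composite cache key; the patch keys on the name only. */
final class ClassCacheKey {
    private final String name;
    private final List<String> packages;
    private final ClassLoader loader;

    ClassCacheKey(String name, List<String> packages, ClassLoader loader) {
        this.name = name;
        // Defensive copy so later changes to the caller's list cannot alter the key.
        this.packages = Collections.unmodifiableList(new ArrayList<>(packages));
        this.loader = loader;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ClassCacheKey)) return false;
        ClassCacheKey other = (ClassCacheKey) o;
        // Classloaders are compared by identity, not by equals().
        return name.equals(other.name)
                && packages.equals(other.packages)
                && loader == other.loader;
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, packages, System.identityHashCode(loader));
    }
}
```

One caveat with such a key is that a shared cache would hold a strong reference to each classloader, which matters when cores are unloaded.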





  was (Author: noble.paul):
The patch currently caches the result only if the default set of packages 
is used. If you pass an extra list of package names, the result is not 
cached. The ideal solution is to include the package names in the key as well. 
I have ignored those use cases for simplicity. I have also ignored cases where 
classes are loaded by the parent classloader. Ideally the classloader must also be 
considered when making the key for the cache. 

This is useful when cores are loaded/unloaded very frequently and there are a 
large number of cores (tens of thousands). In other cases the perf benefits 
are negligible. So this does not apply to loading plugins either. 




  
 SolrResourceLoader must cache name vs class
 ---

 Key: SOLR-921
 URL: https://issues.apache.org/jira/browse/SOLR-921
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-921.patch


 Every class that is loaded through SolrResourceLoader does a Class.forName(), 
 and if the class is not found a ClassNotFoundException is thrown.
 Then it looks through the various packages and finds the right class if the 
 name starts with solr. Considering that we usually use this 
 solr.classname format, we pay too high a price for this. After every 
 lookup the result can be cached in a Map<String,Class> and shared 
 across all the cores; this Map can be stored at the CoreContainer level.




[jira] Updated: (SOLR-977) version=2.2 is not necessary for wt=javabin

2009-01-22 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-977:


Fix Version/s: 1.4

 version=2.2 is not necessary for wt=javabin
 ---

 Key: SOLR-977
 URL: https://issues.apache.org/jira/browse/SOLR-977
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Reporter: Noble Paul
Priority: Trivial
 Fix For: 1.4

 Attachments: SOLR-977.patch


 CommonsHttpSolrServer can drop the version=2.2 parameter when wt=javabin




[jira] Commented: (SOLR-921) SolrResourceLoader must cache name vs class

2009-01-22 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666432#action_12666432
 ] 

Shalin Shekhar Mangar commented on SOLR-921:


Hoss, the use case is a server with a very large number of cores, with cores 
being loaded/unloaded all the time.

For your concern #1 -- The code does not cache if the package list passed to 
the method is different from the default list of packages (which are always 
loaded by the webapp classloader), so these can be shared by all cores.

On #2 -- When you put custom classes in $solr_home/lib, they have different 
packages from Solr's own packages. So one would most likely use the fully 
qualified class name. In that case the caching won't happen.

The only problem right now is that if you want to override a class supplied by 
Solr by adding a jar to $solr_home/lib, it won't take precedence. This can 
be fixed easily before we commit.

 SolrResourceLoader must cache name vs class
 ---

 Key: SOLR-921
 URL: https://issues.apache.org/jira/browse/SOLR-921
 Project: Solr
  Issue Type: Improvement
Reporter: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-921.patch


 Every class that is loaded through SolrResourceLoader does a Class.forName(), 
 and if the class is not found a ClassNotFoundException is thrown.
 Then it looks through the various packages and finds the right class if the 
 name starts with solr. Considering that we usually use this 
 solr.classname format, we pay too high a price for this. After every 
 lookup the result can be cached in a Map<String,Class> and shared 
 across all the cores; this Map can be stored at the CoreContainer level.
