Re: Bad contentType for search handler :text/xml; charset=UTF-8

2015-04-23 Thread didier deshommes
On Wed, Apr 22, 2015 at 4:17 PM, Yonik Seeley ysee...@gmail.com wrote:

 On Wed, Apr 22, 2015 at 11:00 AM, didier deshommes dfdes...@gmail.com
 wrote:
  curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation"
  -H "Content-type:application/json"

 You're telling Solr the body encoding is JSON, but then you don't send any
 body.
 We could catch that error earlier perhaps, but it still looks like an
 error?


Agreed, it's still an error. But the traceback looks like something
horrible has happened to Solr, and is not particularly informative to the
user. An error message like "Empty request body" would help.

I suspect that this issue about content-type may come up again in libraries
that interact with Solr since its behavior pre-5.1 was to just ignore empty
body requests and return the response anyway.
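
For reference, a minimal sketch of the same request with an actual JSON body
(assuming the stock gettingstarted collection; the body syntax follows the 5.1
JSON Request API as I understand it, so treat it as an untested example):

$ curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true" \
  -H "Content-type:application/json" \
  -d '{"query":"foundation"}'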

-Yonik



Re: Bad contentType for search handler :text/xml; charset=UTF-8

2015-04-22 Thread didier deshommes
A similar problem seems to happen when sending application/json to the
search handler. Solr returns a NullPointerException for some reason:

vagrant@precise64:~/solr-5.1.0$ curl "http://localhost:8983/solr/gettingstarted/select?wt=json&indent=true&q=foundation" \
-H "Content-type:application/json"
{
  "responseHeader":{
    "status":500,
    "QTime":2,
    "params":{
      "indent":"true",
      "json":"",
      "q":"foundation",
      "wt":"json"}},
  "error":{
    "trace":"java.lang.NullPointerException\n\tat
org.apache.solr.request.json.ObjectUtil$ConflictHandler.mergeMap(ObjectUtil.java:60)\n\tat
org.apache.solr.request.json.ObjectUtil.mergeObjects(ObjectUtil.java:114)\n\tat
org.apache.solr.request.json.RequestUtil.mergeJSON(RequestUtil.java:259)\n\tat
org.apache.solr.request.json.RequestUtil.processParams(RequestUtil.java:176)\n\tat
org.apache.solr.util.SolrPluginUtils.setDefaults(SolrPluginUtils.java:166)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:140)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)\n\tat
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)\n\tat
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:368)\n\tat
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)\n\tat
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)\n\tat
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)\n\tat
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)\n\tat
org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)\n\tat
org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)\n\tat
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)\n\tat
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)\n\tat
java.lang.Thread.run(Thread.java:745)\n",
    "code":500}}

On Wed, Apr 22, 2015 at 9:41 AM, Walter Underwood wun...@wunderwood.org
wrote:

 text/xml is not a safe content-type, because of the way that HTTP handles
 charsets. Always use application/xml.

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)


 On Apr 22, 2015, at 3:01 AM, bengates benga...@aliceadsl.fr wrote:

  Looks like Solarium hardcodes a default header "Content-Type: text/xml;
  charset=utf-8" if none is provided.
  Removing it solves the problem.
 
  It seems that Solr 5.1 doesn't support this content-type.
 
 
 
  --
  View this message in context:
 http://lucene.472066.n3.nabble.com/Bad-contentType-for-search-handler-text-xml-charset-UTF-8-tp4200314p4201579.html
  Sent from the Solr - User mailing list archive at Nabble.com.
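
As a concrete illustration of the advice above (a hedged sketch; the document
body and collection name are made up), an XML update request would send
application/xml rather than text/xml:

$ curl "http://localhost:8983/solr/gettingstarted/update?commit=true" \
  -H "Content-type: application/xml" \
  -d '<add><doc><field name="id">doc1</field></doc></add>'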




Re: solr cloud does not start with many collections

2015-03-06 Thread didier deshommes
It would be a huge step forward if one could have several hundred Solr
collections but only have a small portion of them open/loaded at the
same time. This is similar to ElasticSearch's close index API, listed here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-open-close.html
. I opened an issue a few months ago to implement the same in Solr:
https://issues.apache.org/jira/browse/SOLR-6399
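
For comparison, the ElasticSearch API referenced above closes and reopens an
index roughly like this (sketch based on the linked docs; the index name is a
placeholder):

$ curl -XPOST "http://localhost:9200/day_32/_close"
$ curl -XPOST "http://localhost:9200/day_32/_open"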

On Thu, Mar 5, 2015 at 4:42 PM, Damien Kamerman dami...@gmail.com wrote:

 I've tried a few variations, with 3 x ZK, 6 X nodes, solr 4.10.3, solr 5.0
 without any success and no real difference. There is a tipping point at
 around 3,000-4,000 cores (varies depending on hardware) from where I can
 restart the cloud OK within ~4min, to the cloud not working and
 continuous 'conflicting
 information about the leader of shard' warnings.

 On 5 March 2015 at 14:15, Shawn Heisey apa...@elyograg.org wrote:

  On 3/4/2015 5:37 PM, Damien Kamerman wrote:
   I'm running on Solaris x86, I have plenty of memory and no real limits
   # plimit 15560
   15560:  /opt1/jdk/bin/java -d64 -server -Xss512k -Xms32G -Xmx32G
   -XX:MaxMetasp
  resource  current maximum
 time(seconds) unlimited   unlimited
 file(blocks)  unlimited   unlimited
 data(kbytes)  unlimited   unlimited
 stack(kbytes) unlimited   unlimited
 coredump(blocks)  unlimited   unlimited
 nofiles(descriptors)  65536   65536
 vmemory(kbytes)   unlimited   unlimited
  
   I've been testing with 3 nodes, and that seems OK up to around 3,000
  cores
   total. I'm thinking of testing with more nodes.
 
  I have opened an issue for the problems I encountered while recreating a
  config similar to yours, which I have been doing on Linux.
 
  https://issues.apache.org/jira/browse/SOLR-7191
 
  It's possible that the only thing the issue will lead to is improvements
  in the documentation, but I'm hopeful that there will be code
  improvements too.
 
  Thanks,
  Shawn
 
 


 --
 Damien Kamerman



Re: Unload collection in SolrCloud

2014-08-20 Thread didier deshommes
I added a JIRA issue here: https://issues.apache.org/jira/browse/SOLR-6399


On Thu, May 22, 2014 at 4:16 PM, Erick Erickson erickerick...@gmail.com
wrote:

 "Age out" in this context just means implementing an LRU cache for open
 cores. When the cache limit is exceeded, the oldest core is closed
 automatically.

 Best,
 Erick

 On Thu, May 22, 2014 at 10:27 AM, Saumitra Srivastav
 saumitra.srivast...@gmail.com wrote:
  Eric,
 
  Can you elaborate more on what you mean by age out?
 
 
 
  --
  View this message in context:
 http://lucene.472066.n3.nabble.com/Unload-collection-in-SolrCloud-tp4135706p4137707.html
  Sent from the Solr - User mailing list archive at Nabble.com.



Re: Unload collection in SolrCloud

2014-05-22 Thread didier deshommes
On Thu, May 22, 2014 at 10:30 AM, Erick Erickson erickerick...@gmail.com wrote:

 If we manage to extend the lazy core loading from stand-alone to
 lazy collection loading in SolrCloud would that satisfy the
 use-case? It still doesn't allow manual unloading of the collection,
 but the large collection would age out if it was truly not used all
 that much. That said, I don't know if there's active work in this
 direction right now.


This is a nice option to have but having the ability to manually load and
unload a collection is still needed. For example, if you're doing analytics
work and storing a day's data in a collection, you still want to be able to
access day 32 and day 64 even if you keep 30 days of data loaded in memory.

I also think having a manual option would allow people more flexibility in
how they manage the number of collections they keep loaded.





 Best,
 Erick

 On Thu, May 22, 2014 at 5:35 AM, Saumitra Srivastav
 saumitra.srivast...@gmail.com wrote:
  Yes, that's what I am doing.
 
  IMO in addition to search, Solr satisfies the needs of lot of analytics
  applications as well, and on-demand loading is a common use case in
  analytics(to keep TCO low), so it would be nice to keep this supported.
 
 
  Regards,
  Saumitra
 
 
 
  On Thu, May 22, 2014 at 5:37 PM, Shalin Shekhar Mangar [via Lucene] 
  ml-node+s472066n4137630...@n3.nabble.com wrote:
 
  Ah, I see. So if I understand it correctly, you are sharing the cluster
  with other collections which are more frequently used and you want to
 keep
  resources available for them so you keep your collection dormant most of
  the time until requested.
 
  No, we don't have such an API. It'd be cool to have a lazy loaded
  collection though. Thank you for describing the use-case because the way
  that we're moving towards (ZK as truth etc.), the core admin APIs will
  gradually be phased out and satisfying your use-case would become
  impossible. Let me think more on this.
 
 
  On Thu, May 22, 2014 at 4:57 PM, Saumitra Srivastav 
   [hidden email]
  wrote:
 
   I don't want to delete the collection/shards. I just want to unload
 all
   shards/replica of the collection temporarily.
  
   Let me explain my use case.
  
   I have a collection alias say *collectionA* which consists of n
   collections(n=5) each with 8 shards and 2 replica over a 16 machine
   cluster.
   *collectionA* is quite big in size and used very rarely, so we keep
 all
   shards/replica of *collectionA* unloaded most of the time. Only when
  user
   request to use it, we load it in memory. To load/unload shards/replica
  of
   aliased *collectionA*, we use CLUSTERSTATUS api to get list of all
   shards/replicas in aliased collection and then use CORE ADMIN api to
   load/unload them.
  
   As you can see there is lot of manual work involved, so I want to know
  if
   there is an API to load/unload ALL shards/replicas of a collection?
  
  
   Regards,
   Saumitra
  
  
   On Thu, May 22, 2014 at 4:36 PM, Shalin Shekhar Mangar [via Lucene] 
    [hidden email]
 
  wrote:
  
You can use the delete Collection API.
   
   
   
  
 
 https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api6
   
   
   
On Thu, May 22, 2014 at 3:56 PM, Saumitra Srivastav 
    [hidden email]
 
wrote:
   
 Guys, any suggestions for this??



 --
 View this message in context:

   
  
 
 http://lucene.472066.n3.nabble.com/Unload-collection-in-SolrCloud-tp4135706p4137602.html
 Sent from the Solr - User mailing list archive at Nabble.com.

   
   
   
--
Regards,
Shalin Shekhar Mangar.
   
   
 
   
   
  
  
  
  
   --
   View this message in context:
  
 
 http://lucene.472066.n3.nabble.com/Unload-collection-in-SolrCloud-tp4135706p4137612.html
 
   Sent from the Solr - User mailing list archive at Nabble.com.
  
 
 
 
  --
  Regards,
  Shalin Shekhar Mangar.
 
 
 

Re: Solrcloud - adding a node as a replica?

2013-09-19 Thread didier deshommes
Thanks Furkan,
That's exactly what I was looking for.


On Wed, Sep 18, 2013 at 4:21 PM, Furkan KAMACI furkankam...@gmail.com wrote:

 Are you looking for this:

 http://lucene.472066.n3.nabble.com/SOLR-Cloud-Collection-Management-quesiotn-td4063305.html

 On Wednesday, September 18, 2013, didier deshommes dfdes...@gmail.com wrote:
  Hi,
  How do I add a node as a replica to a solrcloud cluster? Here is my
  situation: some time ago, I created several collections
  with replicationFactor=2. Now I need to add a new replica. I thought just
  starting a new node and re-using the same zookeeper instance would make it
  automatically a replica, but that isn't the case. Do I need to delete and
  re-create my collections with the right replicationFactor (3 in this
 case)
  again? I am using solr 4.3.0.
 
  Thanks,
  didier
 



Solrcloud - adding a node as a replica?

2013-09-18 Thread didier deshommes
Hi,
How do I add a node as a replica to a solrcloud cluster? Here is my
situation: some time ago, I created several collections
with replicationFactor=2. Now I need to add a new replica. I thought just
starting a new node and re-using the same zookeeper instance would make it
automatically a replica, but that isn't the case. Do I need to delete and
re-create my collections with the right replicationFactor (3 in this case)
again? I am using solr 4.3.0.

Thanks,
didier


Re: Collection - loadOnStartup

2013-08-05 Thread didier deshommes
For Solr 4.3.0, I don't think you can pass loadOnStartup to the Collections
API, although the Cores API accepts it. That's been my experience anyway.


On Mon, Aug 5, 2013 at 6:27 AM, Srivatsan ranjith.venkate...@gmail.com wrote:

 No errors in zookeeper and solr. I'm using CloudSolrServer for creating
 collections as said above. I just want to set loadOnStartup to false for
 cores in solr.xml. I don't want all cores to load on startup. Hence, when
 creating a collection, I try to set this parameter to false, but I still
 get the same value for loadOnStartup in solr.xml.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Collection-loadOnStartup-tp4082531p4082546.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: transientCacheSize doesn't seem to have any effect, except on startup

2013-05-08 Thread didier deshommes
Any idea on this? I still cannot get the combination of transient cores and
transientCacheSize to work as I think it should: give me the ability to
create a large number cores and automatically load and unload them for me
based on a limit that I set.

If anyone else is using this feature and it is working for you, let me know
how you got it working!


On Fri, May 3, 2013 at 2:11 PM, didier deshommes dfdes...@gmail.com wrote:


 On Fri, May 3, 2013 at 11:18 AM, Erick Erickson 
 erickerick...@gmail.com wrote:

 The cores aren't loaded (or at least shouldn't be) for getting the status.
 The _names_ of the cores should be returned, but those are (supposed) to
 be
 retrieved from a list rather than loaded cores. So are you sure that's
 not what
 you are seeing? How are you determining whether the cores are actually
 loaded
 or not?


 I'm looking at the output of :

 $ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status"

 cores that are loaded have a startTime and upTime value. Cores that
 are unloaded don't appear in the output at all. For example, I created 3
 transient cores with transientCacheSize=2 . When I asked for a list of
 all cores, all 3 cores were returned. I explicitly unloaded 1 core and got
 back 2 cores when I asked for the list again.

 It would be nice if cores had an "isTransient" and an "isCurrentlyLoaded"
 value so that one could see exactly which cores are loaded.




 That said, it's perfectly possible that the status command is doing
 something we
 didn't anticipate, but I took a quick look at the code (got to rush to a
 plane)
 and CoreAdminHandler _appears_ to be just returning whatever info it can
 about an unloaded core for status. I _think_ you'll get more info if the
 core has ever been loaded though, even though if it's been removed from
 the transient cache. Ditto for the create action.

 So let's figure out whether you're really seeing loaded cores or not, and
 then
 raise a JIRA if so...

 Thanks for reporting!
 Erick

 On Thu, May 2, 2013 at 1:27 PM, didier deshommes dfdes...@gmail.com
 wrote:
  Hi,
  I've been very interested in the transient core feature of solr to
 manage a
  large number of cores. I'm especially interested in this use case, that
 the
  wiki lists at http://wiki.apache.org/solr/LotsOfCores (looks to be down
  now):
 
 loadOnStartup=false transient=true: This is really the use-case. There
 are
  a large number of cores in your system that are short-duration use. You
  want Solr to load them as necessary, but unload them when the cache gets
  full on an LRU basis.
 
  I'm creating 10 transient cores via the core admin API like so
 
  $ curl "http://localhost:8983/solr/admin/cores?wt=json&action=CREATE&name=new_core2&instanceDir=collection1/&dataDir=new_core2&transient=true&loadOnStartup=false"
  
 
  and have transientCacheSize=2 in my solr.xml file, which I take means
 I
  should have at most 2 transient cores loaded at any time. The problem is
  that these cores are still loaded when I ask solr to list cores:
 
  $ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status"
 
  From the explanation in the wiki, it looks like solr would manage
 loading
  and unloading transient cores for me without having to worry about them,
  but this is not what's happening.
 
  The situation is different when I restart solr; it does the right
 thing
  by loading the maximum cores set by transientCacheSize. When I add more
  cores, the old behavior happens again, where all created transient cores
  are loaded in solr.
 
  I'm using the development branch lucene_solr_4_3 to run my example. I
 can
  open a jira if need be.





Re: transientCacheSize doesn't seem to have any effect, except on startup

2013-05-03 Thread didier deshommes
On Fri, May 3, 2013 at 11:18 AM, Erick Erickson erickerick...@gmail.com wrote:

 The cores aren't loaded (or at least shouldn't be) for getting the status.
 The _names_ of the cores should be returned, but those are (supposed) to be
 retrieved from a list rather than loaded cores. So are you sure that's not
 what
 you are seeing? How are you determining whether the cores are actually
 loaded
 or not?


I'm looking at the output of :

$ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status"

cores that are loaded have a startTime and upTime value. Cores that are
unloaded don't appear in the output at all. For example, I created 3
transient cores with transientCacheSize=2 . When I asked for a list of
all cores, all 3 cores were returned. I explicitly unloaded 1 core and got
back 2 cores when I asked for the list again.

It would be nice if cores had an "isTransient" and an "isCurrentlyLoaded"
value so that one could see exactly which cores are loaded.
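
In the meantime, a rough way to eyeball which cores are actually loaded is to
look for the startTime field in the status output (a sketch; it just greps the
JSON response described above):

$ curl -s "http://localhost:8983/solr/admin/cores?wt=json&action=status" \
  | grep -Eo '"(name|startTime)":"[^"]*"'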




 That said, it's perfectly possible that the status command is doing
 something we
 didn't anticipate, but I took a quick look at the code (got to rush to a
 plane)
 and CoreAdminHandler _appears_ to be just returning whatever info it can
 about an unloaded core for status. I _think_ you'll get more info if the
 core has ever been loaded though, even though if it's been removed from
 the transient cache. Ditto for the create action.

 So let's figure out whether you're really seeing loaded cores or not, and
 then
 raise a JIRA if so...

 Thanks for reporting!
 Erick

 On Thu, May 2, 2013 at 1:27 PM, didier deshommes dfdes...@gmail.com
 wrote:
  Hi,
  I've been very interested in the transient core feature of solr to
 manage a
  large number of cores. I'm especially interested in this use case, that
 the
  wiki lists at http://wiki.apache.org/solr/LotsOfCores (looks to be down
  now):
 
 loadOnStartup=false transient=true: This is really the use-case. There
 are
  a large number of cores in your system that are short-duration use. You
  want Solr to load them as necessary, but unload them when the cache gets
  full on an LRU basis.
 
  I'm creating 10 transient cores via the core admin API like so
 
  $ curl "http://localhost:8983/solr/admin/cores?wt=json&action=CREATE&name=new_core2&instanceDir=collection1/&dataDir=new_core2&transient=true&loadOnStartup=false"
  
 
  and have transientCacheSize=2 in my solr.xml file, which I take means I
  should have at most 2 transient cores loaded at any time. The problem is
  that these cores are still loaded when I ask solr to list cores:
 
  $ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status"
 
  From the explanation in the wiki, it looks like solr would manage loading
  and unloading transient cores for me without having to worry about them,
  but this is not what's happening.
 
  The situation is different when I restart solr; it does the right thing
  by loading the maximum cores set by transientCacheSize. When I add more
  cores, the old behavior happens again, where all created transient cores
  are loaded in solr.
 
  I'm using the development branch lucene_solr_4_3 to run my example. I can
  open a jira if need be.



transientCacheSize doesn't seem to have any effect, except on startup

2013-05-02 Thread didier deshommes
Hi,
I've been very interested in the transient core feature of solr to manage a
large number of cores. I'm especially interested in this use case, which the
wiki lists at http://wiki.apache.org/solr/LotsOfCores (looks to be down
now):

loadOnStartup=false transient=true: This is really the use-case. There are
a large number of cores in your system that are short-duration use. You
want Solr to load them as necessary, but unload them when the cache gets
full on an LRU basis.

I'm creating 10 transient cores via the core admin API like so

$ curl "http://localhost:8983/solr/admin/cores?wt=json&action=CREATE&name=new_core2&instanceDir=collection1/&dataDir=new_core2&transient=true&loadOnStartup=false"


and have transientCacheSize=2 in my solr.xml file, which I take means I
should have at most 2 transient cores loaded at any time. The problem is
that these cores are still loaded when I ask solr to list cores:

$ curl "http://localhost:8983/solr/admin/cores?wt=json&action=status"

From the explanation in the wiki, it looks like solr would manage loading
and unloading transient cores for me without having to worry about them,
but this is not what's happening.

The situation is different when I restart solr; it does the right thing
by loading the maximum cores set by transientCacheSize. When I add more
cores, the old behavior happens again, where all created transient cores
are loaded in solr.

I'm using the development branch lucene_solr_4_3 to run my example. I can
open a jira if need be.
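
For anyone trying to reproduce this, the ten cores above can be created with a
small loop (a sketch; the core and directory names are only placeholders):

$ for i in $(seq 1 10); do
>   curl "http://localhost:8983/solr/admin/cores?wt=json&action=CREATE&name=new_core$i&instanceDir=collection1/&dataDir=new_core$i&transient=true&loadOnStartup=false"
> done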


Re: transientCacheSize not working

2013-03-22 Thread didier deshommes
I've created an issue and patch here that makes it possible to specify
transient and loadOnStatup on core creation:
https://issues.apache.org/jira/browse/SOLR-4631


On Wed, Mar 20, 2013 at 10:14 AM, didier deshommes dfdes...@gmail.com wrote:

 Thanks. Is there a way to pass loadOnStartup and/or transient as
 parameters to the core admin http api? This doesn't seem to work: curl
 http://localhost:8983/solr/admin/cores?action=CREATE&transient=true&name=c1


On Tue, Mar 19, 2013 at 7:29 PM, Mark Miller markrmil...@gmail.com wrote:

 I don't think SolrCloud works with the transient stuff.

 - Mark

 On Mar 19, 2013, at 8:04 PM, didier deshommes dfdes...@gmail.com wrote:

  Hi,
  I cannot get Solrcloud to respect transientCacheSize when creating
 multiple
  cores via the web api. I'm runnig solr 4.2 like this:
 
  java -Dbootstrap_confdir=./solr/collection1/conf
  -Dcollection.configName=conf1 -DzkRun -DnumShards=1 -jar start.jar
 
  I'm creating multiple cores via the core admin http api:
  curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=tmp1"
  curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=tmp2"
  curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=tmp3"
 
  My solr.xml looks like:
 
  <?xml version="1.0" encoding="UTF-8" ?>
  <solr persistent="true">
    <cores transientCacheSize="2" adminPath="/admin/cores" shareSchema="true"
           zkClientTimeout="${zkClientTimeout:15000}" hostPort="8983"
           hostContext="solr">
    </cores>
  </solr>
 
  When I list all cores currently loaded, via curl
  http://localhost:8983/solr/admin/cores?action=status , I notice that
 all 3
  cores are still running, even though transientCacheSize is 2. Can anyone
  tell me why that is?
 
  Also, is there a way to pass loadOnStartup and transient to the core
 admin
  http api? Specifying these when creating a core doesn't seem to work:
 curl
  http://localhost:8983/solr/admin/cores?action=CREATE&transient=true
 
  Thanks,
  didier





Re: transientCacheSize not working

2013-03-20 Thread didier deshommes
Thanks. Is there a way to pass loadOnStartup and/or transient as parameters
to the core admin http api? This doesn't seem to work: curl
http://localhost:8983/solr/admin/cores?action=CREATE&transient=true&name=c1


On Tue, Mar 19, 2013 at 7:29 PM, Mark Miller markrmil...@gmail.com wrote:

 I don't think SolrCloud works with the transient stuff.

 - Mark

 On Mar 19, 2013, at 8:04 PM, didier deshommes dfdes...@gmail.com wrote:

  Hi,
  I cannot get Solrcloud to respect transientCacheSize when creating
 multiple
  cores via the web api. I'm running solr 4.2 like this:
 
  java -Dbootstrap_confdir=./solr/collection1/conf
  -Dcollection.configName=conf1 -DzkRun -DnumShards=1 -jar start.jar
 
  I'm creating multiple cores via the core admin http api:
  curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=tmp1"
  curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=tmp2"
  curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=tmp3"
 
  My solr.xml looks like:
 
  <?xml version="1.0" encoding="UTF-8" ?>
  <solr persistent="true">
    <cores transientCacheSize="2" adminPath="/admin/cores" shareSchema="true"
           zkClientTimeout="${zkClientTimeout:15000}" hostPort="8983"
           hostContext="solr">
    </cores>
  </solr>
 
  When I list all cores currently loaded, via curl
  http://localhost:8983/solr/admin/cores?action=status , I notice that
 all 3
  cores are still running, even though transientCacheSize is 2. Can anyone
  tell me why that is?
 
  Also, is there a way to pass loadOnStartup and transient to the core
 admin
  http api? Specifying these when creating a core doesn't seem to work:
 curl
  http://localhost:8983/solr/admin/cores?action=CREATE&transient=true
 
  Thanks,
  didier




transientCacheSize not working

2013-03-19 Thread didier deshommes
Hi,
I cannot get Solrcloud to respect transientCacheSize when creating multiple
cores via the web api. I'm running solr 4.2 like this:

java -Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=conf1 -DzkRun -DnumShards=1 -jar start.jar

I'm creating multiple cores via the core admin http api:
curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=tmp1"
curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=tmp2"
curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=tmp3"

My solr.xml looks like:

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores transientCacheSize="2" adminPath="/admin/cores" shareSchema="true"
         zkClientTimeout="${zkClientTimeout:15000}" hostPort="8983"
         hostContext="solr">
  </cores>
</solr>

When I list all cores currently loaded, via curl
http://localhost:8983/solr/admin/cores?action=status , I notice that all 3
cores are still running, even though transientCacheSize is 2. Can anyone
tell me why that is?

Also, is there a way to pass loadOnStartup and transient to the core admin
http api? Specifying these when creating a core doesn't seem to work: curl
http://localhost:8983/solr/admin/cores?action=CREATE&transient=true

Thanks,
didier


Re: Cache replication

2011-08-10 Thread didier deshommes
Consider putting a cache (memcached, redis, etc) *in front* of your
solr slaves. Just make sure to update it when replication occurs.

didier

On Tue, Aug 9, 2011 at 6:07 PM, arian487 akarb...@tagged.com wrote:
 I'm wondering if the caches on all the slaves are replicated across (such as
 queryResultCache).  That is to say, if I hit one of my slaves and cache a
 result, and I make a search later and that search happens to hit a different
 slave, will that first cached result be available for use?

 This is pretty important because I'm going to have a lot of slaves and if
 this isn't done, then I'd have a high chance of running a lot of uncached
 queries.

 Thanks :)

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Cache-replication-tp3240708p3240708.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: QTime Solr Query

2011-02-10 Thread didier deshommes
On Thu, Feb 10, 2011 at 4:08 PM, Stijn Vanhoorelbeke
stijn.vanhoorelb...@gmail.com wrote:
 Hi,

 I've done some stress testing onto my solr system ( running in the ec2 cloud
 ).
 From what I've noticed during the tests, the QTime drops to just 1 or 2 ms (
 on a index of ~2 million documents ).

 My first thought pointed me to the different Solr caches; so I've disabled
 all of them. Yet QTime stays low.
 Then the Lucence internal Field Cache came into sight. This cache is hidden
 deep into Lucence and is not configurable trough Solr.

 To cope with this I thought I would lower the memory allocated to Solr -
 that way a smaller cache is forced.
 But yet QTime stays low.

When stress-testing Solr, I usually flush the OS cache also. This is
the command to do it on linux:

# sync; echo 3 > /proc/sys/vm/drop_caches

didier

 Can Solr be so fast to retrieve queries in just 1/2 ms - even if I only
 allocate 100 Mb to Solr?



multiple cores, solr.xml and replication

2010-10-21 Thread didier deshommes
Hi there,
I noticed that the java-based replication does not make replication of
multiple cores automatic. For example, if I have a master with 7
cores, any slave I set up has to explicitly know about each of the 7
cores to be able to replicate them. This information is stored in
solr.xml, and since this file is out of the conf/ directory, it's
impossible to make the java-based replication copy this file over each
slave. Is this by design? For those of you  doing multicore
replication, how do you handle it?

Is overwriting solr.xml when persist=true is used thread-safe? What
happens if I create 2 different cores at the same time? I ask because
I have 7 cores total and I always end up with only 2 or 3 cores in my
solr.xml after doing a bulk delta-import across cores.

didier


Re: multiple cores, solr.xml and replication

2010-10-21 Thread didier deshommes
On Thu, Oct 21, 2010 at 3:00 PM, Shawn Heisey s...@elyograg.org wrote:
 On 10/21/2010 1:42 PM, didier deshommes wrote:

 I noticed that the java-based replication does not make replication of
 multiple cores automatic. For example, if I have a master with 7
 cores, any slave I set up has to explicitly know about each of the 7
 cores to be able to replicate them. This information is stored in
 solr.xml, and since this file is out of the conf/ directory, it's
 impossible to make the java-based replication copy this file over each
 slave. Is this by design? For those of you  doing multicore
 replication, how do you handle it?

 My slave replication handler looks like this, used for all cores.  The
 solr.core.name parameter is dynamically replaced with the name of the
 current core:

I use this configuration too, but doesn't this assume that solr.xml is
the same on the master and the slave? What happens when the master creates a
new core?

didier


 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="slave">
     <str name="masterUrl">http://HOST:8983/solr/${solr.core.name}/replication</str>
     <str name="pollInterval">00:00:15</str>
   </lst>
 </requestHandler>

 Shawn




Re: Jetty rerturning HTTP error code 413

2010-08-18 Thread didier deshommes
Hi Alexandre,
Have you tried setting a higher headerBufferSize?  Look in
etc/jetty.xml and search for 'headerBufferSize'; I think it controls
the size of the url. By default it is 8192.
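
For reference, in the etc/jetty.xml shipped with the Solr example it is set on
the connector, roughly like this (a sketch; the exact connector element varies
with the Jetty version):

<New class="org.mortbay.jetty.bio.SocketConnector">
  <Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set>
  <!-- raise this (in bytes) if long query URLs come back as HTTP 413 FULL HEAD -->
  <Set name="headerBufferSize">65536</Set>
</New>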

didier

On Wed, Aug 18, 2010 at 2:43 PM, Alexandre Rocco alel...@gmail.com wrote:
 Guys,

 We are facing an issue executing a very large query (~4000 bytes in the URL)
 in Solr.
 When we execute the query, Solr (probably Jetty) returns an HTTP 413 error
 (FULL HEAD).

 I guess that this is related to the very big query being executed, and
 currently we can't make it shorter.
 Is there any configuration that need to be tweaked on Jetty or other
 component to make this query work?

 Any advice is really appreciated.

 Thanks!
 Alexandre Rocco



Re: help finding illegal chars in XML doc

2010-07-18 Thread didier deshommes
For XML 1.1 documents, you can check whether any of your documents contain
the restricted characters defined here:
http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-RestrictedChar

If they are, you'll have to remove them.
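
One quick way to find or strip most of these before posting (a sketch; grep -P
needs GNU grep, and the ranges cover the control characters that XML forbids):

# show offending lines
$ grep -Pn '[\x00-\x08\x0B\x0C\x0E-\x1F]' doc.xml

# or strip them outright
$ tr -d '\000-\010\013\014\016-\037' < doc.xml > doc-clean.xml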

didier

On Sun, Jul 18, 2010 at 11:16 AM, robert mena robert.m...@gmail.com wrote:
 Hi,

 I am doing some tests with solr 1.4.1.

 I've created an XML file with the documents I'd like to index. With a few
 items (1000) everything went fine.

 When I went to a more representative import (around 6) I got an error

 java -jar example/exampledocs/post.jar doc.xml
 SimplePostTool: version 1.2
 SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8,
 other encodings are not currently supported
 SimplePostTool: POSTing files to http://localhost:8983/solr/update..
 SimplePostTool: POSTing file add.xml
 SimplePostTool: FATAL: Solr returned an error:
 Illegal_character_CTRLCHAR_code_27__at_rowcol_unknownsource_37022847

 I've tried to track where this problem is located without luck.

 Any ideas?



Re: how to get tf-idf values in solr

2010-06-15 Thread didier deshommes
Have you taken a look at Solr's TermVector component? It's probably
what you want:

http://wiki.apache.org/solr/TermVectorComponent
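
A rough example against the stock example config, which wires the component
into a /tvrh handler (hedged; it also assumes the fields of interest have
termVectors enabled in the schema):

$ curl "http://localhost:8983/solr/tvrh?q=*:*&fl=id&tv=true&tv.tf=true&tv.df=true&tv.tf_idf=true"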

didier

On Tue, Jun 15, 2010 at 8:38 AM, sarfaraz masood
sarfarazmasood2...@yahoo.com wrote:
 I am Sarfaraz, working on a Search Engine
 project which is based on Nutch  Solr. I am trying to implement a
 new Search Algorithm for this engine.

 Our search engine is crawling the web and storing the documents in form of 
 large strings in the database indexed by their urls.

 Now
 to implement my algorithm i need tf - idf values(0 - 1) for each
 document given by the crawler. but i m unable to find any method in
 solr or lucene which can serve my purpose.

 For my algorithm i need to maintain a relevance matrix of the following type :

 e.g.
          term1   term2   term3   term4 ...
  url1    0.7     0.8     0.3     0.1
  url2    0.4     0.1     0.4     0.5
  url3    ...
  ...
 and
 for this purpose i need a core java method/function in solr that
 returns me the tf idf values for all terms in all documents for the
 available document list..

 Plz help

 I will highly grateful to you all

 -Sarfaraz Masood




Re: Multiple Cores Vs. Single Core for the following use case

2010-01-27 Thread didier deshommes
On Wed, Jan 27, 2010 at 9:48 AM, Matthieu Labour
matthieu_lab...@yahoo.com wrote:
 What I am trying to understand is the search/filter algorithm. If I have 1 
 core with all documents and I  search for Paris for userId=123, is lucene 
 going to first search for all Paris documents and then apply a filter on the 
 userId ? If this is the case, then I am better off having a specific index 
 for the user=123 because this will be faster

If you want to apply the filter to userid first, use filter queries
(http://wiki.apache.org/solr/CommonQueryParameters#fq). This will
filter by userid first then search for Paris.
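
Something like this (a sketch; the field name userId is taken from your
example and the default select handler is assumed):

$ curl "http://localhost:8983/solr/select?q=Paris&fq=userId:123&wt=json"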

didier






 --- On Wed, 1/27/10, Marc Sturlese marc.sturl...@gmail.com wrote:

 From: Marc Sturlese marc.sturl...@gmail.com
 Subject: Re: Multiple Cores Vs. Single Core for the following use case
 To: solr-user@lucene.apache.org
 Date: Wednesday, January 27, 2010, 2:22 AM


 In case you are going to use core per user take a look to this patch:
 http://wiki.apache.org/solr/LotsOfCores

 Trey-13 wrote:

 Hi Matt,

 In most cases you are going to be better off going with the userid method
 unless you have a very small number of users and a very large number of
 docs/user. The userid method will likely be much easier to manage, as you
 won't have to spin up a new core every time you add a new user.  I would
 start here and see if the performance is good enough for your requirements
 before you start worrying about it not being efficient.

 That being said, I really don't have any idea what your data looks like.
 How many users do you have?  How many documents per user?  Are any
 documents
 shared by multiple users?

 -Trey



 On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour
 matthieu_lab...@yahoo.com wrote:

 Hi



 Shall I set up Multiple Core or Single core for the following use case:



 I have X number of users.



 When I do a search, I always know for which user I am doing a search



 Shall I set up X cores, 1 for each user ? Or shall I set up 1 core and
 add
 a userId field to each document?



 If I choose the 1 core solution then I am concerned with performance.
 Let's say I search for NewYork ... If lucene returns all New York
 matches for all users and then filters based on the userId, then this
 is going to be less efficient than if I have sharded per user and send
 the request for New York to the user's core



 Thank you for your help



 matt










 --
 View this message in context: 
 http://old.nabble.com/Multiple-Cores-Vs.-Single-Core-for-the-following-use-case-tp27332288p27335403.html
 Sent from the Solr - User mailing list archive at Nabble.com.







Re: solr perf

2009-12-21 Thread didier deshommes
Have you tried loading solr instances as you need them and unloading
those that are not being used? I wish I could help more; I don't know
many people running that many cores.

didier

On Sun, Dec 20, 2009 at 2:38 PM, Matthieu Labour matth...@strateer.com wrote:
 Hi
 I have a solr instance in which I created 700 cores, 1 core per user of my
 application.
 The total size of the data indexed on disk is 35GB with solr cores going
 from 100KB and few documents to 1.2GB and 50 000 documents.
 Searching seems very slow and indexing as well
 This is running on a EC2 xtra large instance (6CPU, 15GB Memory, Raid0 disk)
 I would appreciate it if anybody has some tips, articles, etc. on what to do
 to understand and improve performance.
 Thank you



Re: question about merging indexes

2009-10-26 Thread didier deshommes
On Sun, Oct 25, 2009 at 1:15 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : I need some help about the mergeindex command. I have 2 cores A and B
 : that I want to merge into a new index RES. A has 100 docs and B 10
 : docs. All of B's docs are from A, except that one attribute is
 : changed. The goal is to bring the updated attributes from B into A.

 that's not how mergeindex works ... merging two indexes is essentially
 just adding all the docs from one index into the other (but w/o the
 reindexing step - it works by copying the raw term info)

 There is no way to modify a doc once it's been indexed.

Oh thanks. This is definitely not what I need then. Thanks for the
clarification!

didier


 : When I issue the mergeindexes command  my RES core only has 10 docs. I
 : expect RES to have 100  or even 110 docs, but 10 is very puzzling. Am
 : I misunderstanding something about merging indexes?

 what exactly was the command you used to do the merge?  you should have
 gotten 110 docs.



 -Hoss




question about merging indexes

2009-10-20 Thread didier deshommes
Hi there,
I need some help about the mergeindex command. I have 2 cores A and B
that I want to merge into a new index RES. A has 100 docs and B 10
docs. All of B's docs are from A, except that one attribute is
changed. The goal is to bring the updated attributes from B into A.
When I issue the mergeindexes command  my RES core only has 10 docs. I
expect RES to have 100  or even 110 docs, but 10 is very puzzling. Am
I misunderstanding something about merging indexes?

What I really want to do is to be able to merge 2 cores, but it looks
like this is still in the works (solr-1331). Thanks!

didier


indexing frequently-changing fields

2009-10-08 Thread didier deshommes
I am using Solr to index data in a SQL database.  Most of the data
doesn't change after initial commit, except for a single boolean field
that indicates whether an item is flagged as 'needing attention'.  So
I have a need_attention field in the database that I update whenever a
user marks an item as needing attention in my UI.  The problem I have
is that I want to offer the ability to include need_attention in my
user's queries, but do not want to incur the expense of having to
reindex whenever this flag changes on an individual document.

I have thought about different solutions to this problem, including
using multi-core and having a smaller core for recently-marked items
that I am willing to do 'near-real-time' commits on.  Are there are
any common solutions to this problem, which I have to imagine is
common in this community?


OutOfMemoryError due to auto-warming

2009-09-24 Thread didier deshommes
Hi there,
We are running solr and allocating  1GB to it and we keep having
OutOfMemoryErrors. We get messages like this:

Error during auto-warming of
key:org.apache.solr.search.queryresult...@c785194d:java.lang.OutOfMemoryError:
Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3209)
at java.lang.String.<init>(String.java:216)
at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
at 
org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:169)
at 
org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:701)
at 
org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:208)
at 
org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:676)
at 
org.apache.solr.search.MissingLastOrdComparator.setNextReader(MissingStringLastComparatorSource.java:181)
at 
org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:94)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:252)
at org.apache.lucene.search.Searcher.search(Searcher.java:173)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
at 
org.apache.solr.search.SolrIndexSearcher.access$000(SolrIndexSearcher.java:51)
at 
org.apache.solr.search.SolrIndexSearcher$3.regenerateItem(SolrIndexSearcher.java:332)
at org.apache.solr.search.LRUCache.warm(LRUCache.java:194)
at 
org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1481)
at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1154)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

And like this:
   Error during auto-warming of
key:org.apache.solr.search.queryresult...@33cf792:java.lang.OutOfMemoryError:
Java heap space

We've searched and one suggestion was to reduce the size of the
various caches that do sorting in solrconfig.xml
(http://osdir.com/ml/solr-user.lucene.apache.org/2009-05/msg01043.html).
Does this solution generally work?  Can anyone think of any other
cause for this problem?
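
For reference, those caches are sized in solrconfig.xml roughly like this (a
sketch with made-up numbers; shrinking size and autowarmCount reduces the
memory used during warming):

<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="0"/>

<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>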

didier


Re: OutOfMemoryError due to auto-warming

2009-09-24 Thread didier deshommes
On Thu, Sep 24, 2009 at 5:40 PM, Francis Yakin fya...@liquid.com wrote:
 You can also increase the JVM heap size if you have enough physical memory;
 for example, if you have 4GB physical, give the JVM a heap size of 2GB or
 2.5GB.

Thanks,
we can definitely do that (we have 4GB available). I also forgot to
add that we're running a development version of solr (git clone from ~
3 weeks ago).

Thanks,
didier


 Francis

 -Original Message-
 From: didier deshommes [mailto:dfdes...@gmail.com]
 Sent: Thursday, September 24, 2009 3:32 PM
 To: solr-user@lucene.apache.org
 Cc: Andrew Montalenti
 Subject: OutOfMemoryError due to auto-warming

 Hi there,
 We are running solr and allocating  1GB to it and we keep having
 OutOfMemoryErrors. We get messages like this:

 Error during auto-warming of
 key:org.apache.solr.search.queryresult...@c785194d:java.lang.OutOfMemoryError:
 Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:3209)
        at java.lang.String.<init>(String.java:216)
        at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
        at 
 org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:169)
        at 
 org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:701)
        at 
 org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:208)
        at 
 org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:676)
        at 
 org.apache.solr.search.MissingLastOrdComparator.setNextReader(MissingStringLastComparatorSource.java:181)
        at 
 org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:94)
        at 
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:252)
        at org.apache.lucene.search.Searcher.search(Searcher.java:173)
        at 
 org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
        at 
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
        at 
 org.apache.solr.search.SolrIndexSearcher.access$000(SolrIndexSearcher.java:51)
        at 
 org.apache.solr.search.SolrIndexSearcher$3.regenerateItem(SolrIndexSearcher.java:332)
        at org.apache.solr.search.LRUCache.warm(LRUCache.java:194)
        at 
 org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1481)
        at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1154)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

 And like this:
   Error during auto-warming of
 key:org.apache.solr.search.queryresult...@33cf792:java.lang.OutOfMemoryError:
 Java heap space

 We've searched and one suggestion was to reduce the size of the
 various caches that do sorting in solrconfig.xml
 (http://osdir.com/ml/solr-user.lucene.apache.org/2009-05/msg01043.html).
 Does this solution generally work?  Can anyone think of any other
 cause for this problem?

 didier