Re: Lucene native facets

2013-04-29 Thread William Bell
https://issues.apache.org/jira/browse/SOLR-4774




On Fri, Apr 26, 2013 at 6:30 AM, Jack Krupansky j...@basetechnology.com wrote:

 Sure, but they are completely different conceptual models of faceting -
 Solr is dynamic, based on the actual data for the hierarchy, while Lucene
 is static, based on a predefined taxonomy that must be meticulously created
 before any data is added.

 Solr answers the question "what structure does your data have?", while
 Lucene answers the question "how does your data fit into a predefined
 structure?" Both are valid and valuable questions, but they are still rather
 distinct.
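To make the distinction concrete, here is a toy sketch of the two models (plain Python, illustrative only — not Solr's or Lucene's actual APIs; the category names are made up):

```python
from collections import Counter

docs = [
    {"cat": "electronics/phones"},
    {"cat": "electronics/laptops"},
    {"cat": "books"},
]

# Solr-style dynamic faceting: the facet values are whatever the data
# actually contains; counts emerge from the indexed documents.
dynamic_counts = Counter(d["cat"].split("/")[0] for d in docs)

# Lucene-style taxonomy faceting: the category tree is defined up front,
# and documents must fit into it at index time.
taxonomy = {"electronics": {"phones", "laptops"}, "books": set()}

def index_category(path):
    """Accept a category path only if it exists in the predefined taxonomy."""
    parts = path.split("/")
    if parts[0] not in taxonomy:
        raise ValueError("unknown top-level category: " + path)
    if len(parts) > 1 and parts[1] not in taxonomy[parts[0]]:
        raise ValueError("unknown subcategory: " + path)
    return parts
```

In the first model, a new category value simply shows up in the counts; in the second, it is rejected until the taxonomy is updated.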

 Yes, Solr should provide support for static facet taxonomies, but what
 exactly that would look like... has not even been proposed yet, let alone
 as simple as facet.lucene=true.

 OTOH, maybe most of the work is simply to add taxonomy management to
 Solr (as a passthrough to the Lucene features), and then maybe a lot of the
 existing Solr facet parameters simply need parallel Lucene-oriented
 implementations.

 But, the other half of Solr facets is how filter queries are used for
 selecting facets. That's all done at the application level, so it can't be
 hidden from the app so easily. Maybe a new Solr facet filter API can be
 developed that can then in turn have Solr facet vs. Lucene facet
 implementations. Or, maybe a new dynamic facet Lucene API could be added as
 well, so that Solr facets in fact become a passthrough as well.

 Still, it would be good to support Lucene facets in Solr. Maybe that could
 be one of the key turning points for what defines Lucene/Solr 5.0.

 Is there a Jira for this? I don't recall one.

 -- Jack Krupansky

 -Original Message- From: William Bell
 Sent: Friday, April 26, 2013 4:01 AM
 To: solr-user@lucene.apache.org
 Subject: Lucene native facets


 Since facets are now included in Lucene, why don't we add a pass-through
 from Solr? The current facet code can live on, but we could create a new
 param like facet.lucene=true?

 Seems like a great enhancement !


 --
 Bill Bell
 billnb...@gmail.com
 cell 720-256-8076




-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Solr performance issues for simple query - q=*:* with start and rows

2013-04-29 Thread Abhishek Sanoujam
We have a solr core with about 115 million documents. We are trying to 
migrate data and running a simple *:* query with start and rows params.
The performance is becoming too slow in Solr; it's taking almost 2 mins 
to get 4000 rows, and the migration is just too slow. Log snippet below:


INFO: [coreName] webapp=/solr path=/select params={start=55438000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=168308
INFO: [coreName] webapp=/solr path=/select params={start=55446000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=122771
INFO: [coreName] webapp=/solr path=/select params={start=55454000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=137615
INFO: [coreName] webapp=/solr path=/select params={start=5545&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141223
INFO: [coreName] webapp=/solr path=/select params={start=55462000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=97474
INFO: [coreName] webapp=/solr path=/select params={start=55458000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=98115
INFO: [coreName] webapp=/solr path=/select params={start=55466000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=143822
INFO: [coreName] webapp=/solr path=/select params={start=55474000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118066
INFO: [coreName] webapp=/solr path=/select params={start=5547&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=121498
INFO: [coreName] webapp=/solr path=/select params={start=55482000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=164062
INFO: [coreName] webapp=/solr path=/select params={start=55478000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=165518
INFO: [coreName] webapp=/solr path=/select params={start=55486000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118163
INFO: [coreName] webapp=/solr path=/select params={start=55494000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141642
INFO: [coreName] webapp=/solr path=/select params={start=5549&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=145037



I've taken some thread dumps on the Solr server, and most of the time the 
threads seem to be busy in the following stacks.
Is there anything that can be done to improve the performance? Is it a 
known issue? It's very surprising that fetching just a few thousand rows 
starting at these offsets takes on the order of minutes.
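The slowness is inherent to offset paging: to serve start=N and rows=M, the collector must keep a priority queue of N+M entries and then pop them all, which matches the PriorityQueue/TopDocsCollector frames in the thread dumps below. A toy model of that cost (plain Python, not Solr code):

```python
import heapq

def top_docs(scores, start, rows):
    """Toy model of a TopDocs-style collector: to return the page
    [start, start+rows), it must first collect the best start+rows
    hits in a heap, then discard the first `start` of them. The
    heap size grows with the offset, not with the page size."""
    heap_size = start + rows
    best = heapq.nlargest(heap_size, scores)  # descending order
    return best[start:start + rows]

# A page deep in the results forces a huge heap even for a tiny page:
deep_page = top_docs(range(100_000), start=90_000, rows=10)
```

With start around 55 million, each request is effectively sorting tens of millions of entries, which explains QTime in the minutes.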



395883378@qtp-162198005-7 prio=10 tid=0x7f4aa0636000 nid=0x295a 
runnable [0x7f42865dd000]

   java.lang.Thread.State: RUNNABLE
at 
org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)

at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:184)
at 
org.apache.lucene.search.TopDocsCollector.populateResults(TopDocsCollector.java:61)
at 
org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1499)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1366)
at 
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:457)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:410)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)



1154127582@qtp-162198005-3 prio=10 tid=0x7f4aa0613800 nid=0x2956 
runnable [0x7f42869e1000]

   java.lang.Thread.State: RUNNABLE
at 
org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
at 
org.apache.lucene.util.PriorityQueue.updateTop(PriorityQueue.java:210)
at 
org.apache.lucene.search.TopScoreDocCollector$InOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:62)

at org.apache.lucene.search.Scorer.score(Scorer.java:64)
at 
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:605)
at 
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1491)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1366)
at 
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:457)

Re: Update on shards

2013-04-29 Thread Arkadi Colson
Is anyone else having this problem where an update needs to go to a host 
where a shard exists?


java version 1.7.0_17
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

Solr 4.2.1

apache-tomcat-7.0.33

Thx!

Met vriendelijke groeten

Arkadi Colson

Smartbit bvba • Hoogstraat 13 • 3670 Meeuwen
T +32 11 64 08 80 • F +32 11 64 08 81

On 04/25/2013 09:18 AM, Arkadi Colson wrote:

Hi

It seems not to work in my case. We are using the Solr PHP module for 
talking to Solr. Currently we have 2 collections, 'intradesk' and 'lvs', 
on 10 Solr hosts (shards: 5, replication: 2). Because there is no more disk 
space, I created 6 new hosts for the collection 'messages' (shards: 3, 
replication: 2).


'intradesk + lvs':
solr01-dcg
solr01-gs
solr02-dcg
solr02-gs
solr03-dcg
solr03-gs
solr04-dcg
solr04-gs
solr05-dcg
solr05-gs

'messages':
solr06-dcg
solr06-gs
solr07-dcg
solr07-gs
solr08-dcg
solr08-gs

So when doing a select, I can talk to any host. When updating I must 
talk to a host with at least 1 shard on it.


I created the new 'messages' shards with the following command to get 
them on the new hosts (06 - 08): 
http://solr01-dcg.intnet.smartbit.be:8983/solr/admin/collections?action=CREATE&name=messages&numShards=3&replicationFactor=2&collection.configName=smsc&createNodeSet=solr06-gs.intnet.smartbit.be:8983_solr,solr06-dcg.intnet.smartbit.be:8983_solr,solr07-gs.intnet.smartbit.be:8983_solr,solr07-dcg.intnet.smartbit.be:8983_solr,solr08-gs.intnet.smartbit.be:8983_solr,solr08-dcg.intnet.smartbit.be:8983_solr



They are all in the same config set 'smsc'.

Below is the code:

$client = new SolrClient(
    array(
        'hostname' => 'solr01-dcg.intnet.smartbit.be',
        'port' => 8983,
        'login' => '***',
        'password' => '***',
        'path' => 'solr/messages',
        'wt' => 'json'
    )
);

$doc = new SolrInputDocument();
$doc->addField('id', $uniqueID);
$doc->addField('smsc_ssid', $ssID);
$doc->addField('smsc_module', $i['module']);
$doc->addField('smsc_modulekey', $i['moduleKey']);
$doc->addField('smsc_courseid', $courseID);
$doc->addField('smsc_description', $i['description']);
$doc->addField('smsc_content', $i['content']);
$doc->addField('smsc_lastdate', $lastdate);
$doc->addField('smsc_userid', $userID);

$client->addDocument($doc);

The exception I get looks like this:
exception 'SolrClientException' with message 'Unsuccessful update 
request. Response Code 200. (null)'


Nothing special to find in the solr log.

Any idea?


Arkadi

On 04/24/2013 08:43 PM, Mark Miller wrote:
Sorry - need to correct myself - updates worked the same as read 
requests - they also needed to hit a SolrCore in order to get 
forwarded to the right node. I was not thinking clearly when I said 
this applied to just reads and not writes. Both needed a SolrCore to 
do their work - with the request proxying, this is no longer the 
case, so you can hit Solr instances with no SolrCores or with 
SolrCores that are not part of the collection you are working with, 
and both read and write side requests are now proxied to a suitable 
node that has a SolrCore that can do the search or forward the update 
(or accept the update).


- Mark

On Apr 23, 2013, at 3:38 PM, Mark Miller markrmil...@gmail.com wrote:


We have a 3rd release candidate for 4.3 being voted on now.

I have never tested this feature with Tomcat - only Jetty. Users 
have reported it does not work with Tomcat. That leads one to think 
it may have a problem in other containers as well.


A previous contributor donated a patch that explicitly flushes a 
stream in our proxy code - he says this allows the feature to work 
with Tomcat. I committed this feature - the flush can't hurt, and 
given the previous contributions of this individual, I'm fairly 
confident the fix makes things work in Tomcat. I have no first hand 
knowledge that it does work though.


You might take the RC for a spin and test it out yourself: 
http://people.apache.org/~simonw/staging_area/lucene-solr-4.3.0-RC3-rev1470846/


- Mark

On Apr 23, 2013, at 3:20 PM, Furkan KAMACI furkankam...@gmail.com 
wrote:



Hi Mark;

All in all, you are saying that when 4.3 is tagged in the repository (I 
mean when it is ready), this feature will work for Tomcat too in a stable 
version?


2013/4/23 Mark Miller markrmil...@gmail.com


On Apr 23, 2013, at 2:49 PM, Shawn Heisey s...@elyograg.org wrote:


What exactly is the 'request proxying' thing that doesn't work on
tomcat? Is this something different from basic SolrCloud operation where
you send any kind of request to any server and they get directed where
they need to go? I haven't heard of that not working on tomcat 

Re: Shard update error when using DIH

2013-04-29 Thread heaven
Hi, seems like I have exactly the same error:

Apr 28, 2013 11:41:57 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.UnsupportedOperationException
at
org.apache.lucene.queries.function.FunctionValues.longVal(FunctionValues.java:46)
at
org.apache.solr.update.VersionInfo.getVersionFromIndex(VersionInfo.java:201)
at
org.apache.solr.update.UpdateLog.lookupVersion(UpdateLog.java:714)
at
org.apache.solr.update.VersionInfo.lookupVersion(VersionInfo.java:184)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:567)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:346)
at
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
at
org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
at
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:365)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at
org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:937)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:998)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:948)
at
org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:679)

My setup is http://oi43.tinypic.com/1fco40.jpg

Am I missing something in the configuration process?

Best,
Alex



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Shard-update-error-when-using-DIH-tp4035502p4059729.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Re: Shard update error when using DIH

2013-04-29 Thread heaven
Yes, here is the full schema: http://pastebin.com/pFPbD749

On Mon, Apr 29, 2013 at 10:01 AM, heaven wrote: 












Re: Re: Shard update error when using DIH

2013-04-29 Thread Raymond Wiker
You have

<field name="_version_" type="string" indexed="true" stored="true"
multiValued="false" />

--- I think this needs to be long.
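For reference, the stock example schema wires this up with a trie long fieldType plus the field itself, roughly like this (a sketch; check the example schema shipped with your Solr version for the exact definitions):

```xml
<fieldType name="long" class="solr.TrieLongField" precisionStep="0"
           omitNorms="true" positionIncrementGap="0"/>

<field name="_version_" type="long" indexed="true" stored="true"
       multiValued="false"/>
```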


Re: Solr performance issues for simple query - q=*:* with start and rows

2013-04-29 Thread Jan Høydahl
Hi,

How many shards do you have? This is a known issue with deep paging across 
multiple shards, see https://issues.apache.org/jira/browse/SOLR-1726

You may be more successful in going to each shard, one at a time (with 
distrib=false) to avoid this issue.
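Another practical workaround for bulk export is to page by the uniqueKey instead of by a growing start offset, so the server never has to build a multi-million-entry priority queue. A sketch of building such request params ('id' is an assumed uniqueKey field name; adapt to your schema):

```python
def keyset_page_params(last_id=None, rows=4000):
    """Build Solr query params that walk the index by uniqueKey
    instead of a deep 'start' offset. Each page filters past the
    previous page's last id, so 'start' stays at 0 regardless of
    how far into the index we are."""
    params = {
        "q": "*:*",
        "rows": rows,
        "start": 0,            # never grows, unlike offset paging
        "sort": "id asc",      # stable order over the assumed uniqueKey
        "wt": "javabin",
    }
    if last_id is not None:
        # exclusive lower bound: everything strictly after the last id seen
        params["fq"] = "id:{%s TO *]" % last_id
    return params
```

After each response, feed the last document's id back in as `last_id` for the next request.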

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam abhi.sanou...@gmail.com:

 We have a solr core with about 115 million documents. We are trying to 
 migrate data and running a simple query with *:* query and with start and 
 rows param.
 [... quoted log snippet and thread dumps trimmed; see the original message above ...]
 

Re: Re: Re: Shard update error when using DIH

2013-04-29 Thread heaven
Got these errors after switching the field type to long:

 * crm-test: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Unknown fieldtype 'long' specified on field _version_
 * crm-prod: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Unknown fieldtype 'long' specified on field _version_
 * crm-dev: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Unknown fieldtype 'long' specified on field _version_

Monday 29 April 2013, you wrote:


You have 

<field name="_version_" type="string" indexed="true" stored="true" 
multiValued="false" /> 

--- I think this needs to be long. 










Re: Re: Re: Shard update error when using DIH

2013-04-29 Thread Gora Mohanty
On 29 April 2013 14:55, heaven aheave...@gmail.com wrote:
 Got these errors after switching the field type to long:
  *  *crm-test:*
 org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
 Unknown fieldtype 'long' specified on field _version_

You have probably edited your schema. The default one has
<fieldType name="long" class="solr.TrieLongField" precisionStep="0"
omitNorms="true" positionIncrementGap="0"/>
towards the top of the file.

Regards,
Gora


Re: Re: Re: Re: Shard update error when using DIH

2013-04-29 Thread heaven
Whoops, yes, that works.
Will check if that helped to fix the original error now.

Monday 29 April 2013, you wrote:


On 29 April 2013 14:55, heaven wrote: Got these errors 
after switching the field type to long: * crm-test: 
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: 
Unknown fieldtype 'long' specified on field _version_ 

You have probably edited your schema. The default one has <fieldType 
name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" 
positionIncrementGap="0"/> towards the top of the file. 

Regards, Gora 










Issue regarding Indexing PDFs into Solr.

2013-04-29 Thread Krishna Venkateswaran
Hi

I have installed Solr over Apache Tomcat.
I have used Apache Tomcat v6.x for Solr to work.

When trying to upload a file using SolrJ to index it into Solr, I am
getting an exception as follows:

Server at http://localhost:8080/solr-example returned non ok status:500,
message:Internal Server Error

When I looked it up on the internet, I saw that the jar locations were the
issue, and hence I changed them too.
But even then I am still getting this exception.

Can you help me in this regard?

I am also adding the logs from Catalina.out below:



Apr 28, 2013 4:22:05 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: lazy loading error
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:258)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:679)
Caused by: org.apache.solr.common.SolrException: Error loading class
'solr.extraction.ExtractingRequestHandler'
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:440)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:518)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:592)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:249)
... 17 more
Caused by: java.lang.ClassNotFoundException:
solr.extraction.ExtractingRequestHandler
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:615)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:266)
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:424)
... 20 more
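The root cause above is a ClassNotFoundException for the Solr Cell handler, which usually means the extraction jars are not on Solr's classpath. In a stock Solr 4.x layout, solrconfig.xml loads them with <lib> directives roughly like the following (a sketch; the relative paths are assumptions based on the example distribution and must be adjusted to where your Tomcat deployment keeps contrib/ and dist/):

```xml
<!-- Tika and its dependencies for content extraction -->
<lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />
<!-- The solr-cell jar that contains ExtractingRequestHandler -->
<lib dir="../../../dist/" regex="solr-cell-\d.*\.jar" />
```

If those paths resolve, the lazy-loaded `solr.extraction.ExtractingRequestHandler` referenced in solrconfig.xml should be found at request time.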

Thanks and Regards
Krishna


Re: Solr performance issues for simple query - q=*:* with start and rows

2013-04-29 Thread Dmitry Kan
Jan,

Would the same distrib=false help for distributed faceting? We are running
into a similar issue with facet paging.

Dmitry



On Mon, Apr 29, 2013 at 11:58 AM, Jan Høydahl jan@cominvent.com wrote:

 Hi,

 How many shards do you have? This is a known issue with deep paging with
 multi shard, see https://issues.apache.org/jira/browse/SOLR-1726

 You may be more successful in going to each shard, one at a time (with
 distrib=false) to avoid this issue.

 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com
 Solr Training - www.solrtraining.com

 29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam abhi.sanou...@gmail.com:

  We have a solr core with about 115 million documents. We are trying to
 migrate data and running a simple query with *:* query and with start and
 rows param.
  The performance is becoming too slow in solr, its taking almost 2 mins
 to get 4000 rows and migration is being just too slow. Logs snippet below:
 
  INFO: [coreName] webapp=/solr path=/select params={start=55438000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=168308
  INFO: [coreName] webapp=/solr path=/select params={start=55446000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=122771
  INFO: [coreName] webapp=/solr path=/select params={start=55454000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=137615
  INFO: [coreName] webapp=/solr path=/select params={start=5545&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141223
  INFO: [coreName] webapp=/solr path=/select params={start=55462000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=97474
  INFO: [coreName] webapp=/solr path=/select params={start=55458000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=98115
  INFO: [coreName] webapp=/solr path=/select params={start=55466000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=143822
  INFO: [coreName] webapp=/solr path=/select params={start=55474000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118066
  INFO: [coreName] webapp=/solr path=/select params={start=5547&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=121498
  INFO: [coreName] webapp=/solr path=/select params={start=55482000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=164062
  INFO: [coreName] webapp=/solr path=/select params={start=55478000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=165518
  INFO: [coreName] webapp=/solr path=/select params={start=55486000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118163
  INFO: [coreName] webapp=/solr path=/select params={start=55494000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141642
  INFO: [coreName] webapp=/solr path=/select params={start=5549&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=145037
 
 
  I've taken some thread dumps on the Solr server, and most of the time the
  threads seem to be busy in the following stacks:
  Is there anything that can be done to improve the performance? Is it a
  known issue? It's very surprising that fetching just 4000 rows at certain
  start offsets takes on the order of minutes.
 
 
  395883378@qtp-162198005-7 prio=10 tid=0x7f4aa0636000 nid=0x295a
 runnable [0x7f42865dd000]
java.lang.Thread.State: RUNNABLE
 at
 org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
 at
 org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:184)
 at
 org.apache.lucene.search.TopDocsCollector.populateResults(TopDocsCollector.java:61)
 at
 org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
 at
 org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1499)
 at
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1366)
 at
 org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:457)
 at
 org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:410)
 at
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
 at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
 
 
  1154127582@qtp-162198005-3 prio=10 tid=0x7f4aa0613800 nid=0x2956
 runnable [0x7f42869e1000]
java.lang.Thread.State: RUNNABLE
 at
 org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
 at
 

Re: Solr performance issues for simple query - q=*:* with start and rows

2013-04-29 Thread Abhishek Sanoujam
We have a single shard, and all the data is in a single box only.
Definitely looks like deep paging is having problems.


Just to understand: is the searcher looping over the result set
every time and skipping the first start documents? That would definitely
take a toll when we reach higher start values.
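That is close to what happens: the TopDocsCollector visible in the thread dumps keeps the best start + rows hits in a priority queue and only then discards the first start of them, so a page at offset 55 million pays for roughly 55 million queue entries. A minimal sketch of that cost model (illustrative only, not Solr's actual code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class DeepPagingCost {
    // Return the page [start, start+rows) of scores sorted descending.
    // Note the queue must hold start + rows entries: with
    // start=55,000,000 that is ~55M entries maintained per request.
    static List<Long> topPage(long[] scores, int start, int rows) {
        PriorityQueue<Long> pq = new PriorityQueue<>(); // min-heap
        for (long s : scores) {
            pq.add(s);
            if (pq.size() > start + rows) {
                pq.poll(); // evict the smallest of the kept hits
            }
        }
        List<Long> best = new ArrayList<>(pq);
        best.sort(Collections.reverseOrder());
        int to = Math.min(start + rows, best.size());
        if (start >= to) {
            return Collections.emptyList();
        }
        return best.subList(start, to); // the requested page
    }
}
```

Keeping start at 0 and selecting the page some other way keeps the queue at only rows entries, which is why the workarounds discussed in this thread help.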




On 4/29/13 2:28 PM, Jan Høydahl wrote:

Hi,

How many shards do you have? This is a known issue with deep paging with multi 
shard, see https://issues.apache.org/jira/browse/SOLR-1726

You may be more successful in going to each shard, one at a time (with 
distrib=false) to avoid this issue.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam abhi.sanou...@gmail.com:


We have a Solr core with about 115 million documents. We are trying to migrate
data by running a simple *:* query with start and rows params.
The performance is becoming too slow in Solr; it's taking almost 2 minutes to
get 4000 rows, so the migration is just too slow. Log snippet below:

INFO: [coreName] webapp=/solr path=/select params={start=55438000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=168308
INFO: [coreName] webapp=/solr path=/select params={start=55446000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=122771
INFO: [coreName] webapp=/solr path=/select params={start=55454000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=137615
INFO: [coreName] webapp=/solr path=/select params={start=5545&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141223
INFO: [coreName] webapp=/solr path=/select params={start=55462000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=97474
INFO: [coreName] webapp=/solr path=/select params={start=55458000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=98115
INFO: [coreName] webapp=/solr path=/select params={start=55466000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=143822
INFO: [coreName] webapp=/solr path=/select params={start=55474000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118066
INFO: [coreName] webapp=/solr path=/select params={start=5547&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=121498
INFO: [coreName] webapp=/solr path=/select params={start=55482000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=164062
INFO: [coreName] webapp=/solr path=/select params={start=55478000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=165518
INFO: [coreName] webapp=/solr path=/select params={start=55486000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118163
INFO: [coreName] webapp=/solr path=/select params={start=55494000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141642
INFO: [coreName] webapp=/solr path=/select params={start=5549&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=145037


I've taken some thread dumps on the Solr server, and most of the time the
threads seem to be busy in the following stacks:
Is there anything that can be done to improve the performance? Is it a known
issue? It's very surprising that fetching just 4000 rows at certain start
offsets takes on the order of minutes.


395883378@qtp-162198005-7 prio=10 tid=0x7f4aa0636000 nid=0x295a runnable 
[0x7f42865dd000]
   java.lang.Thread.State: RUNNABLE
at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:184)
at 
org.apache.lucene.search.TopDocsCollector.populateResults(TopDocsCollector.java:61)
at 
org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1499)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1366)
at 
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:457)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:410)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)


1154127582@qtp-162198005-3 prio=10 tid=0x7f4aa0613800 nid=0x2956 runnable 
[0x7f42869e1000]
   java.lang.Thread.State: RUNNABLE
at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
at 

Re: Solr performance issues for simple query - q=*:* with start and rows

2013-04-29 Thread Dmitry Kan
Abhishek,

There is a wiki regarding this:

http://wiki.apache.org/solr/CommonQueryParameters

search pageDoc and pageScore.
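For a single shard, those two parameters let the client pass along the last document id and score of the page it just consumed, so the collector can skip everything that sorts before that point instead of re-ranking start documents. A sketch of how a client might build the next request (the base URL and values are hypothetical; only the URL construction is shown, and a real request would URL-encode the query):

```java
public class PageParams {
    // Build the request for the next page, carrying the last doc id and
    // score of the previous page (placeholder values; see the Solr
    // CommonQueryParameters wiki for pageDoc/pageScore semantics).
    static String nextPageUrl(String base, int start, int rows,
                              int lastDocId, float lastScore) {
        return base + "/select?q=*:*&wt=javabin&version=2"
                + "&start=" + start + "&rows=" + rows
                + "&pageDoc=" + lastDocId + "&pageScore=" + lastScore;
    }
}
```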


On Mon, Apr 29, 2013 at 1:17 PM, Abhishek Sanoujam
abhi.sanou...@gmail.com wrote:

 We have a single shard, and all the data is in a single box only.
 Definitely looks like deep paging is having problems.

 Just to understand: is the searcher looping over the result set every time
 and skipping the first start documents? That would definitely take a toll
 when we reach higher start values.




 On 4/29/13 2:28 PM, Jan Høydahl wrote:

 Hi,

 How many shards do you have? This is a known issue with deep paging with
 multi shard, see https://issues.apache.org/jira/browse/SOLR-1726

 You may be more successful in going to each shard, one at a time (with
 distrib=false) to avoid this issue.

 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com
 Solr Training - www.solrtraining.com

 29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam abhi.sanou...@gmail.com
 :

 We have a Solr core with about 115 million documents. We are trying to
 migrate data by running a simple *:* query with start and rows params.
 The performance is becoming too slow in Solr; it's taking almost 2 minutes
 to get 4000 rows, so the migration is just too slow. Log snippet below:

 INFO: [coreName] webapp=/solr path=/select params={start=55438000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=168308
 INFO: [coreName] webapp=/solr path=/select params={start=55446000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=122771
 INFO: [coreName] webapp=/solr path=/select params={start=55454000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=137615
 INFO: [coreName] webapp=/solr path=/select params={start=5545&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141223
 INFO: [coreName] webapp=/solr path=/select params={start=55462000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=97474
 INFO: [coreName] webapp=/solr path=/select params={start=55458000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=98115
 INFO: [coreName] webapp=/solr path=/select params={start=55466000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=143822
 INFO: [coreName] webapp=/solr path=/select params={start=55474000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118066
 INFO: [coreName] webapp=/solr path=/select params={start=5547&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=121498
 INFO: [coreName] webapp=/solr path=/select params={start=55482000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=164062
 INFO: [coreName] webapp=/solr path=/select params={start=55478000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=165518
 INFO: [coreName] webapp=/solr path=/select params={start=55486000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118163
 INFO: [coreName] webapp=/solr path=/select params={start=55494000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141642
 INFO: [coreName] webapp=/solr path=/select params={start=5549&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=145037


 I've taken some thread dumps on the Solr server, and most of the time the
 threads seem to be busy in the following stacks:
 Is there anything that can be done to improve the performance? Is it a
 known issue? It's very surprising that fetching just 4000 rows at certain
 start offsets takes on the order of minutes.


 395883378@qtp-162198005-7 prio=10 tid=0x7f4aa0636000 nid=0x295a
 runnable [0x7f42865dd000]
java.lang.Thread.State: RUNNABLE
  at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
  at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:184)
  at org.apache.lucene.search.TopDocsCollector.populateResults(TopDocsCollector.java:61)
  at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
  at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1499)
  at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1366)
  at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:457)
  at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:410)
  at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
  at org.apache.solr.servlet.SolrDispatchFilter.execute(
 

Re: Issue regarding Indexing PDFs into Solr.

2013-04-29 Thread Furkan KAMACI
It seems that your solrconfig.xml cannot find the libraries. Here is an
example path from solrconfig.xml:

<lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />
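For reference, a sketch of the solrconfig.xml pieces that extraction needs: both the Tika contrib jars and the Solr Cell jar that actually contains ExtractingRequestHandler, plus the handler registration itself. The directory paths are relative to the core's instanceDir and are assumptions that may differ in your layout:

```xml
<!-- Tika and its dependencies -->
<lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />
<!-- the solr-cell jar that contains ExtractingRequestHandler -->
<lib dir="../../../dist/" regex="solr-cell-\d.*\.jar" />

<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler"
                startup="lazy">
  <lst name="defaults">
    <str name="fmap.content">text</str>
  </lst>
</requestHandler>
```

A ClassNotFoundException for solr.extraction.ExtractingRequestHandler at startup almost always means the lib dir/regex above does not match where the jars actually sit on disk.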


2013/4/29 Krishna Venkateswaran krish...@usc.edu

 Hi

 I have installed Solr over Apache Tomcat.
 I have used Apache Tomcat v6.x for Solr to work.

 When trying to upload a file using SolrJ to index it into Solr, I am
 getting an exception as follows:

 Server at http://localhost:8080/solr-example returned non ok status:500,
 message:Internal Server Error

 When I looked it up on the internet, I saw that the jar locations were the
 issue, and hence I changed them too.
 But even then I am still getting this exception.

 Can you help me in this regard?

 I am also adding the logs from Catalina.out below:



 Apr 28, 2013 4:22:05 PM org.apache.solr.common.SolrException log
 SEVERE: null:org.apache.solr.common.SolrException: lazy loading error
 at

 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:258)
 at

 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
 at

 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
 at

 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
 at

 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
 at

 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at

 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at

 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at

 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at

 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at

 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
 at

 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
 at

 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
 at
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:679)
 Caused by: org.apache.solr.common.SolrException: Error loading class
 'solr.extraction.ExtractingRequestHandler'
 at

 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:440)
 at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:518)
 at
 org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:592)
 at

 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:249)
 ... 17 more
 Caused by: java.lang.ClassNotFoundException:
 solr.extraction.ExtractingRequestHandler
 at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
 at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:615)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:266)
 at

 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:424)
 ... 20 more

 Thanks and Regards
 Krishna



createNodeSet

2013-04-29 Thread Arkadi Colson
Is it correct that if I create collection B with the parameter
createNodeSet=hostB, and then query hostA for something in collection A,
it cannot be found?


BR,
Arkadi



solr 3.6 hang for few seconds, need help

2013-04-29 Thread mizayah
Hi,

I'm running Solr 3.6 on Tomcat, under some traffic, about 20 req/s.
I have 6 different cores on it.


I was testing one core by querying it every second with a simple request
and a time param.


INFO: [core1] webapp=/solr3.4-tomcat path=/select params= ... 1:55:05 ...
Apr 29, 2013 1:55:06 PM org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=2
   
commit{dir=/vol/solr3.4-tomcat/core2/index,segFN=segments_4ecd4,version=1331656373519,generation=7387672,filenames=[_1yxp2.fdx,
_4herl.nrm, _2inup_17f5.del, _1yxp2.fd
   
commit{dir=/vol/solr3.4-tomcat/core2/index,segFN=segments_4ecdh,version=1331656373568,generation=7387685,filenames=[_1yxp2.fdx,
_4herl.nrm, _2inup_17f5.del, _1yxp2.fd
Apr 29, 2013 1:55:06 PM org.apache.solr.core.SolrDeletionPolicy
updateCommits
INFO: newest commit = 1331656373568
Apr 29, 2013 1:55:06 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start
commit(optimize=false,waitFlush=false,waitSearcher=true,expungeDeletes=false)
Apr 29, 2013 1:55:06 PM org.apache.solr.core.SolrCore execute
INFO: [core1] webapp=/solr3.4-tomcat path=/select params=... 1:55:06 ...
Apr 29, 2013 1:55:14 PM org.apache.solr.core.SolrCore execute
INFO: [core1] webapp=/solr3.4-tomcat path=/select params=... 1:55:07 ...
Apr 29, 2013 1:55:15 PM org.apache.solr.core.SolrCore execute
INFO: [core1] webapp=/solr3.4-tomcat path=/select params=... 1:55:15 ...


The query from 1:55:07 was executed at 1:55:14.
Between 1:55:06 and 1:55:17 there is nothing in the Solr and Tomcat logs.


What could happen here? I'm getting that hang every so often.
Could committing, or something else, stop me from searching?






--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-3-6-hang-for-few-seconds-need-help-tp4059760.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr 3.6 hang for few seconds, need help

2013-04-29 Thread Christian von Wendt-Jensen
I'm experiencing the same issue in my setup.

If you do not see any logging for several seconds, then it _could_ be due to 
garbage collection. If you experience heavy traffic and have very large caches, 
then the JVM might be forced to do a full garbage collection from time to time, 
halting all processes. In that case your caches might be too big, and you 
should experiment with decreasing their size. You should be able to profile the 
JVM to monitor garbage collection.
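One way to check this from inside the JVM, using only the standard management beans (no extra libraries): sample the cumulative GC time around a suspicious gap, and if it jumps by several seconds the pause was garbage collection.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTime {
    // Total milliseconds all collectors have spent in GC since JVM start.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() may return -1 if unsupported
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }
}
```

Logging this value once a second next to the query loop makes it easy to correlate a logging gap with a full GC.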



Med venlig hilsen / Best Regards

Christian von Wendt-Jensen
IT Team Lead, Customer Solutions

Infopaq International A/S
Kgs. Nytorv 22
DK-1050 København K

Phone +45 36 99 00 00
Mobile +45 31 17 10 07
Email  christian.sonne.jen...@infopaq.com
Web    www.infopaq.com








DISCLAIMER:
This e-mail and accompanying documents contain privileged confidential 
information. The information is intended only for the recipient(s) named. Any 
unauthorised disclosure, copying, distribution, exploitation or the taking of 
any action in reliance of the content of this e-mail is strictly prohibited. If 
you have received this e-mail in error we would be obliged if you would delete 
the e-mail and attachments and notify the dispatcher by return e-mail or at +45 
36 99 00 00
Please consider the environment before printing this mail note.

From: mizayah miza...@gmail.com
Reply-To: solr-user@lucene.apache.org
Date: Mon, 29 Apr 2013 14:33:35 +0200
To: solr-user@lucene.apache.org
Subject: solr 3.6 hang for few seconds, need help

Hi,

I'm running Solr 3.6 on Tomcat, under some traffic, about 20 req/s.
I have 6 different cores on it.


I was testing one core by querying it every second with a simple request
and a time param.


INFO: [core1] webapp=/solr3.4-tomcat path=/select params= ... 1:55:05 ...
Apr 29, 2013 1:55:06 PM org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=2

commit{dir=/vol/solr3.4-tomcat/core2/index,segFN=segments_4ecd4,version=1331656373519,generation=7387672,filenames=[_1yxp2.fdx,
_4herl.nrm, _2inup_17f5.del, _1yxp2.fd

commit{dir=/vol/solr3.4-tomcat/core2/index,segFN=segments_4ecdh,version=1331656373568,generation=7387685,filenames=[_1yxp2.fdx,
_4herl.nrm, _2inup_17f5.del, _1yxp2.fd
Apr 29, 2013 1:55:06 PM org.apache.solr.core.SolrDeletionPolicy
updateCommits
INFO: newest commit = 1331656373568
Apr 29, 2013 1:55:06 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start
commit(optimize=false,waitFlush=false,waitSearcher=true,expungeDeletes=false)
Apr 29, 2013 1:55:06 PM org.apache.solr.core.SolrCore execute
INFO: [core1] webapp=/solr3.4-tomcat path=/select params=... 1:55:06 ...
Apr 29, 2013 1:55:14 PM org.apache.solr.core.SolrCore execute
INFO: [core1] webapp=/solr3.4-tomcat path=/select params=... 1:55:07 ...
Apr 29, 2013 1:55:15 PM org.apache.solr.core.SolrCore execute
INFO: [core1] webapp=/solr3.4-tomcat path=/select params=... 1:55:15 ...


The query from 1:55:07 was executed at 1:55:14.
Between 1:55:06 and 1:55:17 there is nothing in the Solr and Tomcat logs.


What could happen here? I'm getting that hang every so often.
Could committing, or something else, stop me from searching?






--



Re: Solr performance issues for simple query - q=*:* with start and rows

2013-04-29 Thread Michael Della Bitta
We've found that you can do a lot for yourself by using a filter query
to page through your data, if it has a natural range, instead of using
start and rows.
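A sketch of that approach, assuming the corpus has a numeric id field with a natural order (the field name and range syntax are assumptions; adjust to your schema, and URL-encode the fq in a real request): keep start=0 and advance a range filter instead of the offset.

```java
public class RangeWalk {
    // Build the query for the page after lastSeenId. start stays 0, so
    // the collector only ever ranks `rows` documents per request.
    static String pageQuery(long lastSeenId, int rows) {
        // {x TO *] is an exclusive lower bound, so the last document of
        // the previous page is not returned again.
        return "q=*:*&fq=id:{" + lastSeenId + " TO *]"
                + "&sort=id asc&start=0&rows=" + rows;
    }
}
```

The client remembers the highest id it has seen and feeds it back in; the walk is finished when a page comes back with fewer than rows documents.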

Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Mon, Apr 29, 2013 at 6:44 AM, Dmitry Kan solrexp...@gmail.com wrote:
 Abhishek,

 There is a wiki regarding this:

 http://wiki.apache.org/solr/CommonQueryParameters

 search pageDoc and pageScore.


 On Mon, Apr 29, 2013 at 1:17 PM, Abhishek Sanoujam
 abhi.sanou...@gmail.com wrote:

 We have a single shard, and all the data is in a single box only.
 Definitely looks like deep paging is having problems.

 Just to understand: is the searcher looping over the result set every time
 and skipping the first start documents? That would definitely take a toll
 when we reach higher start values.




 On 4/29/13 2:28 PM, Jan Høydahl wrote:

 Hi,

 How many shards do you have? This is a known issue with deep paging with
 multi shard, see https://issues.apache.org/jira/browse/SOLR-1726

 You may be more successful in going to each shard, one at a time (with
 distrib=false) to avoid this issue.

 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com
 Solr Training - www.solrtraining.com

 29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam abhi.sanou...@gmail.com
 :

 We have a Solr core with about 115 million documents. We are trying to
 migrate data by running a simple *:* query with start and rows params.
 The performance is becoming too slow in Solr; it's taking almost 2 minutes
 to get 4000 rows, so the migration is just too slow. Log snippet below:

 INFO: [coreName] webapp=/solr path=/select params={start=55438000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=168308
 INFO: [coreName] webapp=/solr path=/select params={start=55446000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=122771
 INFO: [coreName] webapp=/solr path=/select params={start=55454000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=137615
 INFO: [coreName] webapp=/solr path=/select params={start=5545&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141223
 INFO: [coreName] webapp=/solr path=/select params={start=55462000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=97474
 INFO: [coreName] webapp=/solr path=/select params={start=55458000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=98115
 INFO: [coreName] webapp=/solr path=/select params={start=55466000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=143822
 INFO: [coreName] webapp=/solr path=/select params={start=55474000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118066
 INFO: [coreName] webapp=/solr path=/select params={start=5547&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=121498
 INFO: [coreName] webapp=/solr path=/select params={start=55482000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=164062
 INFO: [coreName] webapp=/solr path=/select params={start=55478000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=165518
 INFO: [coreName] webapp=/solr path=/select params={start=55486000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118163
 INFO: [coreName] webapp=/solr path=/select params={start=55494000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141642
 INFO: [coreName] webapp=/solr path=/select params={start=5549&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=145037


 I've taken some thread dumps on the Solr server, and most of the time the
 threads seem to be busy in the following stacks:
 Is there anything that can be done to improve the performance? Is it a
 known issue? It's very surprising that fetching just 4000 rows at certain
 start offsets takes on the order of minutes.


 395883378@qtp-162198005-7 prio=10 tid=0x7f4aa0636000 nid=0x295a
 runnable [0x7f42865dd000]
java.lang.Thread.State: RUNNABLE
  at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
  at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:184)
  at org.apache.lucene.search.TopDocsCollector.populateResults(TopDocsCollector.java:61)
  at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
  at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1499)
  at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1366)
  at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:457)
  at

Re: createNodeSet

2013-04-29 Thread Arkadi Colson

I found this in the zookeeper directory /collections/collectionX/

{
  "configName": "smsc",
  "router": "implicit"
}


Is router:implicit the cause of this? Is it possible to fix it?

Thx!

On 04/29/2013 01:24 PM, Arkadi Colson wrote:
Is it correct that if I create a collection B with parameter 
createNodeSet = hostB and I query on hostA something for collectionA 
it could not be found?


BR,
Arkadi








Re: Solr performance issues for simple query - q=*:* with start and rows

2013-04-29 Thread Michael Della Bitta
I guess so, you'd have to use a filter query to page through the set
of documents you were faceting against and sum them all at the end.
It's not quite the same operation as paging through results, because
facets are aggregate statistics, but if you're willing to go through
the trouble, I bet it would also help performance.

Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Mon, Apr 29, 2013 at 9:06 AM, Dmitry Kan solrexp...@gmail.com wrote:
 Michael,

 Interesting! Do (Can) you apply this to facet searches as well?

 Dmitry


 On Mon, Apr 29, 2013 at 4:02 PM, Michael Della Bitta 
 michael.della.bi...@appinions.com wrote:

 We've found that you can do a lot for yourself by using a filter query
 to page through your data if it has a natural range to do so instead
 of start and rows.

 Michael Della Bitta

 
 Appinions
 18 East 41st Street, 2nd Floor
 New York, NY 10017-6271

 www.appinions.com

 Where Influence Isn’t a Game


 On Mon, Apr 29, 2013 at 6:44 AM, Dmitry Kan solrexp...@gmail.com wrote:
  Abhishek,
 
  There is a wiki regarding this:
 
  http://wiki.apache.org/solr/CommonQueryParameters
 
  search pageDoc and pageScore.
 
 
  On Mon, Apr 29, 2013 at 1:17 PM, Abhishek Sanoujam
  abhi.sanou...@gmail.com wrote:
 
  We have a single shard, and all the data is in a single box only.
  Definitely looks like deep-paging is having problems.
 
  Just to understand: is the searcher looping over the result set every
  time and skipping the first start documents? That would definitely take
  a toll when we reach higher start values.
 
 
 
 
  On 4/29/13 2:28 PM, Jan Høydahl wrote:
 
  Hi,
 
  How many shards do you have? This is a known issue with deep paging
 with
  multi shard, see https://issues.apache.org/jira/browse/SOLR-1726
 
  You may be more successful in going to each shard, one at a time (with
  distrib=false) to avoid this issue.
 
  --
  Jan Høydahl, search solution architect
  Cominvent AS - www.cominvent.com
  Solr Training - www.solrtraining.com
 
  29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam 
 abhi.sanou...@gmail.com
  :
 
  We have a Solr core with about 115 million documents. We are trying to
  migrate data by running a simple *:* query with start and rows params.
  The performance is becoming too slow in Solr; it's taking almost 2 minutes
  to get 4000 rows, so the migration is just too slow. Log snippet below:
 
  INFO: [coreName] webapp=/solr path=/select params={start=55438000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=168308
  INFO: [coreName] webapp=/solr path=/select params={start=55446000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=122771
  INFO: [coreName] webapp=/solr path=/select params={start=55454000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=137615
  INFO: [coreName] webapp=/solr path=/select params={start=5545&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141223
  INFO: [coreName] webapp=/solr path=/select params={start=55462000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=97474
  INFO: [coreName] webapp=/solr path=/select params={start=55458000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=98115
  INFO: [coreName] webapp=/solr path=/select params={start=55466000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=143822
  INFO: [coreName] webapp=/solr path=/select params={start=55474000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118066
  INFO: [coreName] webapp=/solr path=/select params={start=5547&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=121498
  INFO: [coreName] webapp=/solr path=/select params={start=55482000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=164062
  INFO: [coreName] webapp=/solr path=/select params={start=55478000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=165518
  INFO: [coreName] webapp=/solr path=/select params={start=55486000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118163
  INFO: [coreName] webapp=/solr path=/select params={start=55494000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141642
  INFO: [coreName] webapp=/solr path=/select params={start=5549&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=145037
 
 
  I've taken some thread dumps in the solr server and most of the time
 the
  threads seem to be busy in the following stacks mostly:
  Is there anything that can be done to improve the performance? Is it a
  known issue? It's very surprising that querying for just some rows
 starting
  at some 

Re: createNodeSet

2013-04-29 Thread Michael Della Bitta
That means that documents will be indexed and stored on the node
they're sent to. It shouldn't keep Solr Cloud from loadbalancing
reads. Fixing that won't address the problem you're asking about, but
it may clear up other unintended behaviors.

What version of Solr are you using, and what servlet container?

Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Mon, Apr 29, 2013 at 9:20 AM, Arkadi Colson ark...@smartbit.be wrote:
 I found this in the zookeeper directory /collections/collectionX/

 {
   "configName":"smsc",
   "router":"implicit"}


 Is router:implicit the cause of this? Is it possible to fix?

 Thx!


 On 04/29/2013 01:24 PM, Arkadi Colson wrote:

 Is it correct that if I create a collection B with parameter createNodeSet
 = hostB and I query on hostA something for collectionA it could not be
 found?

 BR,
 Arkadi







Re: createNodeSet

2013-04-29 Thread Arkadi Colson
The strange thing is that I created 2 other collections some time ago 
and there the router:implicit has not been set. Is it possible to create 
a collection without the router:implicit?


http://solr01:8983/solr/admin/collections?action=CREATE&name=lvs&numShards=5&replicationFactor=2&collection.configName=smsc 
http://solr01-dcg.intnet.smartbit.be:8983/solr/admin/collections?action=CREATE&name=lvs&numShards=5&replicationFactor=2&collection.configName=smsc 



VERSIONS

Solr 4.2.1

java version 1.7.0_17
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

Met vriendelijke groeten

Arkadi Colson

Smartbit bvba • Hoogstraat 13 • 3670 Meeuwen
T +32 11 64 08 80 • F +32 11 64 08 81

On 04/29/2013 03:24 PM, Michael Della Bitta wrote:

That means that documents will be indexed and stored on the node
they're sent to. It shouldn't keep Solr Cloud from loadbalancing
reads. Fixing that won't address the problem you're asking about, but
it may clear up other unintended behaviors.

What version of Solr are you using, and what servlet container?

Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Mon, Apr 29, 2013 at 9:20 AM, Arkadi Colson ark...@smartbit.be wrote:

I found this in the zookeeper directory /collections/collectionX/

{
   "configName":"smsc",
   "router":"implicit"}


Is router:implicit the cause of this? Is it possible to fix?

Thx!


On 04/29/2013 01:24 PM, Arkadi Colson wrote:

Is it correct that if I create a collection B with parameter createNodeSet
= hostB and I query on hostA something for collectionA it could not be
found?

BR,
Arkadi










Re: Solr performance issues for simple query - q=*:* with start and rows

2013-04-29 Thread Dmitry Kan
Thanks.

Only question is how to smoothly transition to this model. Our facet
(string) fields contain timestamp prefixes, that are reverse ordered
starting from the freshest value. In theory, we could try computing the
filter queries for those. But before doing so, we would need the matched
ids from solr, so it becomes at least 2 pass algorithm?

The biggest concern in general we have with the paging is that the system
seems to pass way more data back and forth, than is needed for computing
the values.


On Mon, Apr 29, 2013 at 4:14 PM, Michael Della Bitta 
michael.della.bi...@appinions.com wrote:

 I guess so, you'd have to use a filter query to page through the set
 of documents you were faceting against and sum them all at the end.
 It's not quite the same operation as paging through results, because
 facets are aggregate statistics, but if you're willing to go through
 the trouble, I bet it would also help performance.

 Michael Della Bitta

 
 Appinions
 18 East 41st Street, 2nd Floor
 New York, NY 10017-6271

 www.appinions.com

 Where Influence Isn’t a Game


 On Mon, Apr 29, 2013 at 9:06 AM, Dmitry Kan solrexp...@gmail.com wrote:
  Michael,
 
  Interesting! Do (Can) you apply this to facet searches as well?
 
  Dmitry
 
 
  On Mon, Apr 29, 2013 at 4:02 PM, Michael Della Bitta 
  michael.della.bi...@appinions.com wrote:
 
  We've found that you can do a lot for yourself by using a filter query
  to page through your data if it has a natural range to do so instead
  of start and rows.
 
  Michael Della Bitta
 
  
  Appinions
  18 East 41st Street, 2nd Floor
  New York, NY 10017-6271
 
  www.appinions.com
 
  Where Influence Isn’t a Game
 
 
  On Mon, Apr 29, 2013 at 6:44 AM, Dmitry Kan solrexp...@gmail.com
 wrote:
   Abhishek,
  
   There is a wiki regarding this:
  
   http://wiki.apache.org/solr/CommonQueryParameters
  
   search pageDoc and pageScore.
  
  
   On Mon, Apr 29, 2013 at 1:17 PM, Abhishek Sanoujam
   abhi.sanou...@gmail.comwrote:
  
   We have a single shard, and all the data is in a single box only.
   Definitely looks like deep-paging is having problems.
  
    Just to understand, is the searcher looping over the result set every time
    and skipping the first start count? This will definitely take a toll
    when we reach higher start values.
  
  
  
  
   On 4/29/13 2:28 PM, Jan Høydahl wrote:
  
   Hi,
  
   How many shards do you have? This is a known issue with deep paging
  with
    multi shard, see https://issues.apache.org/jira/browse/SOLR-1726
  
   You may be more successful in going to each shard, one at a time
 (with
   distrib=false) to avoid this issue.
  
   --
   Jan Høydahl, search solution architect
   Cominvent AS - www.cominvent.com
   Solr Training - www.solrtraining.com
  
   29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam 
  abhi.sanou...@gmail.com
   :
  
We have a solr core with about 115 million documents. We are
 trying to
   migrate data and running a simple query with *:* query and with
 start
  and
   rows param.
   The performance is becoming too slow in solr, its taking almost 2
 mins
   to get 4000 rows and migration is being just too slow. Logs snippet
  below:
  
    INFO: [coreName] webapp=/solr path=/select params={start=55438000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=168308
    [...]

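The filter-query paging suggested in this thread can be sketched as follows. This is a hypothetical helper (a numeric, densely populated `id` field is an assumption for illustration): each page is selected by a constant-cost range `fq` while `start` stays at 0, so Solr never has to skip millions of sorted hits.

```python
def range_page_params(lower, upper, page_size):
    """Yield Solr /select params that walk a numeric id range with fq.

    Each request keeps start=0, so the cost per page stays roughly
    constant, unlike deep paging with an ever-growing start offset.
    """
    for page_start in range(lower, upper, page_size):
        yield {
            "q": "*:*",
            # half-open range: includes page_start, excludes the upper bound
            "fq": "id:[%d TO %d}" % (page_start, page_start + page_size),
            "start": 0,
            "rows": page_size,
            "wt": "javabin",
        }

pages = list(range_page_params(0, 12000, 4000))
```

Each dict would be sent as one /select request; the union of the pages covers the range exactly once.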
Re: Solr 4.2.1 SSLInitializationException

2013-04-29 Thread Sarita Nair


:I'm confused ... it seems that you (or GlassFish) has created a 
:Catch-22...

Glassfish specifies keystore as a system property, but does not require 
specifying the password for the keystore as a system property. 
GF uses a keychain mechanism, which requires the password to be passed from the 
DAS to access the keystore.

:In SolrJ client code you can specify whatever HttpClient implementation 
:you want.  In Solr (for it's use of talking to other nodes in distributed 
:search, which is what is indicated in your stack trace) 
:SystemDefaultHttpClient is hard coded.


Thanks for the clarification. I am using DefaultHttpClient on the client side, 
but was hoping that there is a way around SystemDefaultHttpClient in Solr. 
Looks like there isn't any.

Thanks for your help! 







 From: Chris Hostetter hossman_luc...@fucit.org
To: solr-user@lucene.apache.org solr-user@lucene.apache.org; Sarita Nair 
sarita...@yahoo.com 
Sent: Friday, April 12, 2013 3:07 PM
Subject: Re: Solr 4.2.1 SSLInitializationException
 


: Thanks for your response.  As I mentioned in my email, I would prefer 
: the application to not have access to the keystore. Do you know if there 

I'm confused ... it seems that you (or GlassFish) has created a 
Catch-22...

You say you don't want the application to have access to the keystore, but 
apparently you (or GlassFish) are explicitly setting javax.net.ssl.keyStore 
to tell the application which keystore to use.  The keystore you specify 
has a password set on it, but you are not telling the application what the 
password is, so it can't use that keystore.

If you don't want the application to have access to the keystore at all, 
have you tried unsetting javax.net.ssl.keyStore ?

: is a way of specifying  a different HttpClient implementation (e.g. 
: DefaultHttpClient rather than SystemDefaultHttpClient) ?

In SolrJ client code you can specify whatever HttpClient implementation 
you want.  In Solr (for it's use of talking to other nodes in distributed 
search, which is what is indicated in your stack trace) 
SystemDefaultHttpClient is hard coded.


-Hoss

Re: Atomic Update and stored copy-fields

2013-04-29 Thread Erick Erickson
I'd ask it a different way, why in the world would you store the
destinations of copyFields? It just bloats your index to no good
purpose since all the sources are stored.

As you can tell, I don't have a good answer for your question, but for
an explicit warning like that, I'd heed it and/or examine the code.

Best
Erick

On Fri, Apr 26, 2013 at 3:24 AM, raulgrande83 raulgrand...@hotmail.com wrote:
 Hello everybody,

 We are using last version of Solr (4.2.1) and making some tests on Atomic
 Updates.
 The Solr wiki says that: /(...) requires that all fields in your SchemaXml
 must be configured as stored=true except for fields which are copyField/
 destinations -- which must be configured as stored=false (...)/

 We have all of our fields defined as stored, also those which are
 copyField/ destinations. During our tests we didn't notice anything
 extrange in that fields, Atomic Updates are working fine.
 Why copyField/ destinations must be configured as stored=false since they
 are going to be overwritten by their sources?

 Thank you!



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Atomic-Update-and-stored-copy-fields-tp4059129.html
 Sent from the Solr - User mailing list archive at Nabble.com.
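
For reference, the configuration the wiki calls for looks roughly like this; the field names are invented for illustration. The destination is stored="false" because an atomic update rebuilds the document from the stored source fields and re-applies copyField, so a stored destination would only bloat the index (and, for multiValued destinations, risk accumulating duplicate values):

```xml
<!-- source fields: stored, so atomic updates can rebuild the document -->
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="body"  type="text_general" indexed="true" stored="true"/>

<!-- copyField destination: indexed but not stored -->
<field name="text_all" type="text_general" indexed="true" stored="false"
       multiValued="true"/>

<copyField source="title" dest="text_all"/>
<copyField source="body"  dest="text_all"/>
```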


Re: Customizing Solr GUI

2013-04-29 Thread Erick Erickson
Give me access to your raw Solr URLs, and I can submit the following:
.../update?commit=true&stream.body=<delete><query>*:*</query></delete>
which will remove all documents from your index. You really have to
take control of the requests you allow to get to Solr...

Best
Erick

On Fri, Apr 26, 2013 at 9:59 AM, Alexandre Rafalovitch
arafa...@gmail.com wrote:
 So, building on this:
 1) Velocity is an option for internal admin interface because it is
 collocated with Solr and therefore does not 'hide' it
 2) Blacklight is the (Rails-based) application layer and the Solr is
 internal behind it, so it does provide the security.

 Hope this helps to understand the distinction.

 Regards,
Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Fri, Apr 26, 2013 at 12:57 PM, Jack Krupansky
 j...@basetechnology.com wrote:
 Generally, your UI web pages should communicate with your own application
 layer, which in turn communicates with Solr, but you should try to avoid
 having Solr itself visible to the outside world.

 -- Jack Krupansky

 -Original Message- From: kneerosh
 Sent: Friday, April 26, 2013 12:46 PM
 To: solr-user@lucene.apache.org
 Subject: Customizing Solr GUI


 Hi,

  I want to customize Solr gui, and I learnt that the most popular options
 are
 1. Velocity- which is integrated with Solr. The format and options can be
 customized
 2. Project Blacklight

 Pros and cons?

 Secondly I read that one can delete data by just running a delete query in
 the URL. Does either velocity or blacklight provide a way to disable this,
 or provide any kind of security or access control- so that users can only
 browse/search and admins can view the admin screen. How can we handle the
 security aspect in Solr?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Customizing-Solr-GUI-tp4059257.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: relevance when merging results

2013-04-29 Thread Erick Erickson
You cannot rely on scores to be comparable between two queries, or between
two cores with very different kinds of data. Scores are only a way to sort
results within the _same_ query and the _same_ type of core. By type
I mean, say, shards where the schemas are identical and the statistical
characteristics of the docs stored on each are very similar.

Sometimes people use separate tabs. Sometimes they merge by some
other characteristic. But nobody I know of has had joy from comparing
scores.

FWIW
Erick

On Fri, Apr 26, 2013 at 10:36 AM, eShard zim...@yahoo.com wrote:
 Hi,
 I'm currently using Solr 4.0 final on tomcat v7.0.3x
 I have 2 cores (let's call them A and B) and I need to combine them as one
 for the UI.
 However we're having trouble on how to best merge these two result sets.
 Currently, I'm using relevancy to do the merge.
 For example,
 I search for red in both cores.
 Core A has a max score of .919856 with 87 results
 Core B has a max score of .6532563 with 30 results

 I would like to simply merge numerically but I don't know if that's valid.
 If I merge in numerical order then Core B results won't appear until element
 25 or later.

 I initially thought about just taking the top 5 results from each and layer
 one on top of the other.

 Is there a best practice out there for merging relevancy?
 Please advise...
 Thanks,




 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/relevance-when-merging-results-tp4059275.html
 Sent from the Solr - User mailing list archive at Nabble.com.
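
Since raw scores from two differently-shaped cores are not comparable, one common compromise is to interleave the two ranked lists instead of merging by score. A minimal sketch (plain id lists stand in for result docs, purely for illustration):

```python
from itertools import chain, zip_longest

def interleave(results_a, results_b):
    """Merge two ranked lists by alternating ranks, preserving each
    core's own ordering instead of comparing raw relevancy scores."""
    sentinel = object()
    paired = zip_longest(results_a, results_b, fillvalue=sentinel)
    return [doc for doc in chain.from_iterable(paired) if doc is not sentinel]

merged = interleave(["a1", "a2", "a3"], ["b1"])
```

The shorter list simply runs out, and the remainder of the longer list follows in its original order.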


Bizarre Solr issue

2013-04-29 Thread Jack.Drysdale.ctr
Hello, everyone.

I have a really bizarre Solr issue that I hope someone can help me resolve.

Production environment is *nix running CF 9.0.0, with both Verity and Solr
collections.

Trying to list collections is breaking - one collection in particular is
breaking the CFCOLLECTION action=list: Error message states that the
solrconfig.xml file cannot be found.

I unregistered this collection via CFAdmin, then went into the file system
and deleted the folders for this collection and restarted both Application
and Solr services. Ran the script, again, and still getting the same error
message for the collection that we just completely removed.  It's NOT being
cached in the browser.

This is working fine in development (Windows environment, CF9.0.1).

Thoughts/suggestions greatly appreciated.




Re: createNodeSet

2013-04-29 Thread Arkadi Colson
When I first do a linkconfig the router:implicit seems to be gone! So 
recreating the collection will solve this. The problem that I cannot 
request a collection that does not exist on that host is still there.


Arkadi

On 04/29/2013 03:31 PM, Arkadi Colson wrote:
The strange thing is that I created 2 other collections some time ago 
and there the router:implicit has not been set. Is it possible to 
create a collection without the router:implicit?


http://solr01:8983/solr/admin/collections?action=CREATE&name=lvs&numShards=5&replicationFactor=2&collection.configName=smsc 
http://solr01-dcg.intnet.smartbit.be:8983/solr/admin/collections?action=CREATE&name=lvs&numShards=5&replicationFactor=2&collection.configName=smsc 



VERSIONS

Solr 4.2.1

java version 1.7.0_17
Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

Met vriendelijke groeten

Arkadi Colson

Smartbit bvba • Hoogstraat 13 • 3670 Meeuwen
T +32 11 64 08 80 • F +32 11 64 08 81

On 04/29/2013 03:24 PM, Michael Della Bitta wrote:

That means that documents will be indexed and stored on the node
they're sent to. It shouldn't keep Solr Cloud from loadbalancing
reads. Fixing that won't address the problem you're asking about, but
it may clear up other unintended behaviors.

What version of Solr are you using, and what servlet container?

Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Mon, Apr 29, 2013 at 9:20 AM, Arkadi Colson ark...@smartbit.be 
wrote:

I found this in the zookeeper directory /collections/collectionX/

{
   "configName":"smsc",
   "router":"implicit"}


Is router:implicit the cause of this? Is it possible to fix?

Thx!


On 04/29/2013 01:24 PM, Arkadi Colson wrote:
Is it correct that if I create a collection B with parameter 
createNodeSet

= hostB and I query on hostA something for collectionA it could not be
found?

BR,
Arkadi















Re: Bizarre Solr issue

2013-04-29 Thread Alexandre Rafalovitch
Version of Solr would help here. Solr 4+ will log where it finds the
collections if enabled (not sure about earlier versions). The most
likely problem is related to path. Perhaps you are hardcoding a '\'
separator somewhere on Windows and that messes up the path on Unix.
Or you have different Solr versions on dev/prod.

I would probably look for path being mentioned in the logs and, if
that fails, using truss/strace
(http://docstore.mik.ua/orelly/unix2.1/unixnut/c02_236.htm) and just
check where solrconfig.xml is being looked for in reality. It is a
hammer ('when you have a hammer') for sure, but it is often faster
to get ground truth this way and then figure out what's causing it.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Mon, Apr 29, 2013 at 10:15 AM,  jack.drysdale@ustranscom.mil wrote:
 Hello, everyone.

 I have a really bizarre Solr issue that I hope someone can help me resolve.

 Production environment is *nix running CF 9.0.0, with both Verity and Solr
 collections.

 Trying to list collections is breaking - one collection in particular is
 breaking the CFCOLLECTION action=list: Error message states that the
 solrconfig.xml file cannot be found.

 I unregistered this collection via CFAdmin, then went into the file system
 and deleted the folders for this collection and restarted both Application
 and Solr services. Ran the script, again, and still getting the same error
 message for the collection that we just completely removed.  It's NOT being
 cached in the browser.

 This is working fine in development (Windows environment, CF9.0.1).

 Thoughts/suggestions greatly appreciated.


Re: solr 3.6 hang for few seconds, need help

2013-04-29 Thread Erick Erickson
Garbage collection would be my first guess too. Here's an excellent
article on GC:

http://searchhub.org/2011/03/27/garbage-collection-bootcamp-1-0/

Best
Erick

On Mon, Apr 29, 2013 at 5:56 AM, Christian von Wendt-Jensen
christian.vonwendt-jen...@infopaq.com wrote:
 I'm experiencing the same issue in my setup.

 If you do not see any logging for several seconds, then it _could_ be due to 
 garbage collection. If you experience heavy traffic and have very large 
 caches, then the JVM might be forced to do a full garbage collection from 
 time to time, halting all processes. In that case your caches might be too 
 big, and you should experiment with decreasing their size. You should be able 
 to profile the JVM to monitor garbage collection.



 Med venlig hilsen / Best Regards

 Christian von Wendt-Jensen
 IT Team Lead, Customer Solutions

 Infopaq International A/S
 Kgs. Nytorv 22
 DK-1050 København K

 Phone +45 36 99 00 00
 Mobile +45 31 17 10 07
 Email  christian.sonne.jen...@infopaq.com
 Web    www.infopaq.com








 DISCLAIMER:
 This e-mail and accompanying documents contain privileged confidential 
 information. The information is intended only for the recipient(s) named. Any 
 unauthorised disclosure, copying, distribution, exploitation or the taking of 
 any action in reliance of the content of this e-mail is strictly prohibited. 
 If you have received this e-mail in error we would be obliged if you would 
 delete the e-mail and attachments and notify the dispatcher by return e-mail 
 or at +45 36 99 00 00
 P Please consider the environment before printing this mail note.

 From: mizayah miza...@gmail.com
 Reply-To: solr-user@lucene.apache.org
 Date: Mon, 29 Apr 2013 14:33:35 +0200
 To: solr-user@lucene.apache.org
 Subject: solr 3.6 hang for few seconds, need help
 Subject: solr 3.6 hang for few seconds, need help

 Hi,

 I'm running Solr 3.6 on Tomcat, under some traffic, about 20 req/s.
 I got 6 different cores on it.


 I was testing one core by querying every second with a simple request and a
 time param.


 INFO: [core1] webapp=/solr3.4-tomcat path=/select params= ... 1:55:05 ...
 Apr 29, 2013 1:55:06 PM org.apache.solr.core.SolrDeletionPolicy onInit
 INFO: SolrDeletionPolicy.onInit: commits:num=2

 commit{dir=/vol/solr3.4-tomcat/core2/index,segFN=segments_4ecd4,version=1331656373519,generation=7387672,filenames=[_1yxp2.fdx,
 _4herl.nrm, _2inup_17f5.del, _1yxp2.fd

 commit{dir=/vol/solr3.4-tomcat/core2/index,segFN=segments_4ecdh,version=1331656373568,generation=7387685,filenames=[_1yxp2.fdx,
 _4herl.nrm, _2inup_17f5.del, _1yxp2.fd
 Apr 29, 2013 1:55:06 PM org.apache.solr.core.SolrDeletionPolicy
 updateCommits
 INFO: newest commit = 1331656373568
 Apr 29, 2013 1:55:06 PM org.apache.solr.update.DirectUpdateHandler2 commit
 INFO: start
 commit(optimize=false,waitFlush=false,waitSearcher=true,expungeDeletes=false)
 Apr 29, 2013 1:55:06 PM org.apache.solr.core.SolrCore execute
 INFO: [core1] webapp=/solr3.4-tomcat path=/select params=... 1:55:06 ...
 Apr 29, 2013 1:55:14 PM org.apache.solr.core.SolrCore execute
 INFO: [core1] webapp=/solr3.4-tomcat path=/select params=... 1:55:07 ...
 Apr 29, 2013 1:55:15 PM org.apache.solr.core.SolrCore execute
 INFO: [core1] webapp=/solr3.4-tomcat path=/select params=... 1:55:15 ...


 The query from 1:55:07 was executed at 1:55:14.
 Between 1:55:06 and 1:55:17 there is nothing in the Solr and Tomcat logs.


 What could be happening here? I'm getting that hang every so often.
 Could committing, or something else, stop me from searching?






 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/solr-3-6-hang-for-few-seconds-need-help-tp4059760.html
 Sent from the Solr - User mailing list archive at Nabble.com.
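
To confirm whether the silent window lines up with a stop-the-world collection, GC logging can be turned on for the Tomcat JVM. These are standard HotSpot flags for the JVM generation discussed here; the file path and variable placement are illustrative:

```shell
# e.g. in bin/setenv.sh; path is illustrative
CATALINA_OPTS="$CATALINA_OPTS \
  -verbose:gc \
  -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps \
  -Xloggc:/var/log/tomcat/gc.log"
```

Long "Full GC" entries whose timestamps match the gap would confirm the garbage-collection theory.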



Current Has A Red Emblem, Slave Has A Higher Version And Doesn't Do Anything To Catch Up Master

2013-04-29 Thread Furkan KAMACI
If you can help me it would be nice. I have tested crawling at my amazon
instances and I have a weird situation:

My slave version is higher than master (actually I have killed my master
and started up it again at some time)

Replication      Version          Gen   Size
Master:          1367243029412    49    1.29 GB
Slave:           1367243033892    49    807.42 MB

and at all my instances there is a green tick for optimized, but a red
emblem at current. The slave is still behind the master (807.42 MB vs.
1.29 GB) and it doesn't do anything to catch up. I use Solr 4.2.1 as
SolrCloud with an external Zookeeper ensemble.

What should I do?


RE: Bizarre Solr issue

2013-04-29 Thread Jack.Drysdale.ctr
Hello, Alex, and thank you for your reply.

I just looked it up: ColdFusion Server 9 ships with Solr version 1.4.1.  Both 
dev and production environments use the same version.

The script that I wrote takes environment into consideration - with three 
Windows dev environments and one Linux production environment, all paths are 
appropriately hardcoded for folder delimiter using a conditional statement - 
all Windows environments are similar to C:\path\to\collection and the Linux 
environment is hardcoded for /usr/path/to/collection.  Although, I _could_ 
use a / for everything because CF Server automatically converts \ to / 
in paths, I'm so OCD about coding that I use whichever is correct for which 
environment.

How do I access the Solr logs to check?  I'm fairly new to collections.

Thank you,

Jack

-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: Monday, April 29, 2013 9:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Bizarre Solr issue

Version of Solr would help here. Solr 4+ will log where it finds the 
collections if enabled (not sure about earlier versions). The most likely 
problem is related to path. Perhaps you are hardcoding a '\'
separator somewhere on Windows and that messes up the path on Unix.
Or you have different Solr versions on dev/prod.

I would probably look for path being mentioned in the logs and, if that fails, 
using truss/strace
(http://docstore.mik.ua/orelly/unix2.1/unixnut/c02_236.htm) and just check 
where solrconfig.xml is being looked for in reality. It is a hammer ('when you 
have a hammer') for sure, but it is often faster to get ground truth 
this way and then figure out what's causing it.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. 
Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Mon, Apr 29, 2013 at 10:15 AM,  jack.drysdale@ustranscom.mil wrote:
 Hello, everyone.

 I have a really bizarre Solr issue that I hope someone can help me resolve.

 Production environment is *nix running CF 9.0.0, with both Verity and
 Solr collections.

 Trying to list collections is breaking - one collection in particular
 is breaking the CFCOLLECTION action=list: Error message states that
 the solrconfig.xml file cannot be found.

 I unregistered this collection via CFAdmin, then went into the file
 system and deleted the folders for this collection and restarted both
 Application and Solr services. Ran the script, again, and still
 getting the same error message for the collection that we just
 completely removed.  It's NOT being cached in the browser.

 This is working fine in development (Windows environment, CF9.0.1).

 Thoughts/suggestions greatly appreciated.




Re: createNodeSet

2013-04-29 Thread Mark Miller
What version of Solr? That should work with Jetty as of 4.2 (not before), and 
with Tomcat as of 4.3 (not before).

- Mark

On Apr 29, 2013, at 10:19 AM, Arkadi Colson ark...@smartbit.be wrote:

 When I first do a linkconfig the router:implicit seems to be gone! So 
 recreating the collection will solve this. The problem that I cannot request 
 a collection that does not exist on that host is still there.
 
 Arkadi
 
 On 04/29/2013 03:31 PM, Arkadi Colson wrote:
 The strange thing is that I created 2 other collections some time ago and 
 there the router:implicit has not been set. Is it possible to create a 
 collection without the router:implicit?
 
 http://solr01:8983/solr/admin/collections?action=CREATE&name=lvs&numShards=5&replicationFactor=2&collection.configName=smsc
 http://solr01-dcg.intnet.smartbit.be:8983/solr/admin/collections?action=CREATE&name=lvs&numShards=5&replicationFactor=2&collection.configName=smsc
  
 
 VERSIONS
 
 Solr 4.2.1
 
 java version 1.7.0_17
 Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
 Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)
 
 Met vriendelijke groeten
 
 Arkadi Colson
 
 Smartbit bvba • Hoogstraat 13 • 3670 Meeuwen
 T +32 11 64 08 80 • F +32 11 64 08 81
 
 On 04/29/2013 03:24 PM, Michael Della Bitta wrote:
 That means that documents will be indexed and stored on the node
 they're sent to. It shouldn't keep Solr Cloud from loadbalancing
 reads. Fixing that won't address the problem you're asking about, but
 it may clear up other unintended behaviors.
 
 What version of Solr are you using, and what servlet container?
 
 Michael Della Bitta
 
 
 Appinions
 18 East 41st Street, 2nd Floor
 New York, NY 10017-6271
 
 www.appinions.com
 
 Where Influence Isn’t a Game
 
 
 On Mon, Apr 29, 2013 at 9:20 AM, Arkadi Colson ark...@smartbit.be wrote:
 I found this in the zookeeper directory /collections/collectionX/
 
  {
    "configName":"smsc",
    "router":"implicit"}
 
 
 Is router:implicit the cause of this? Is it possible to fix?
 
 Thx!
 
 
 On 04/29/2013 01:24 PM, Arkadi Colson wrote:
 Is it correct that if I create a collection B with parameter createNodeSet
 = hostB and I query on hostA something for collectionA it could not be
 found?
 
 BR,
 Arkadi
 
 
 
 
 
 
 
 
 
 



why does * affect case sensitivity of query results

2013-04-29 Thread geeky2
hello,

environment: solr 3.5


problem statement: when a query has * appended, it becomes case sensitive.

assumption: query should NOT be case sensitive

actual value in database at time of index: 4387828BULK

here is a snapshot of what works and does not work.

what works:

  itemModelNoExactMatchStr:4387828bULk (and any variation of upper and lower
case letters for *bulk*)

  itemModelNoExactMatchStr:4387828bu*
  itemModelNoExactMatchStr:4387828bul*
  itemModelNoExactMatchStr:4387828bulk*


what does NOT work:

 itemModelNoExactMatchStr:4387828BU*
 itemModelNoExactMatchStr:4387828BUL*
 itemModelNoExactMatchStr:4387828BULK*


below are the specifics of my field and fieldType

  <field name="itemModelNoExactMatchStr" type="text_exact" indexed="true"
         stored="true"/>


  <fieldType name="text_exact" class="solr.TextField"
             positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

thx
mark





--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-does-affect-case-sensitivity-of-query-results-tp4059801.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Bizarre Solr issue

2013-04-29 Thread Jack.Drysdale.ctr
I don’t know if this will make any difference, or not, but production is two 
load-balanced servers (as far as I know, both identical).

If I run the script specifically on the first server, it errors as I have 
described.

If I run the script specifically on the second server, it lists the collections 
with no issue, but when I try to index the collection from a query, nothing is 
put into the collection - it will have a size of 0 and a doccount of 0 when the 
script completes.

Jack


smime.p7s
Description: S/MIME cryptographic signature


solr query- get results without scanning files

2013-04-29 Thread dafnashkedy
I would like to execute a solr query and get only the uniquKey I've defined.
The documents are very big, so setting fl='my_key' is not fast enough - all
the matching documents are still scanned and the query can take hours (even
though the search itself is fast - numFound takes a few seconds to return).
I should mention that all the data is stored, and creating a new index is
not an option.

One idea I had was to get the docIds of the results and map them to my_key
in the code.
I used fl=[docid], thinking it doesn't need scanning to get this info, but
it still takes too long to return.

Is there a better way to get the docIds?
Or a way to unstore certain fields without reindexing?
Or perhaps a completely different way to get the results without scanning
all the fields?

Thanks,

Dafna



--
View this message in context: 
http://lucene.472066.n3.nabble.com/solr-query-get-results-without-scanning-files-tp4059798.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: why does * affect case sensitivity of query results

2013-04-29 Thread Alexandre Rafalovitch
http://wiki.apache.org/solr/MultitermQueryAnalysis

Sorry, not for your version of Solr.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Mon, Apr 29, 2013 at 11:40 AM, geeky2 gee...@hotmail.com wrote:
 hello,

 environment: solr 3.5


 problem statement: when a query has * appended, it becomes case sensitive.

 assumption: the query should NOT be case sensitive

 actual value in database at time of index: 4387828BULK

 here is a snapshot of what works and does not work.

 what works:

   itemModelNoExactMatchStr:4387828bULk (and any variation of upper and lower
 case letters for *bulk*)

   itemModelNoExactMatchStr:4387828bu*
   itemModelNoExactMatchStr:4387828bul*
   itemModelNoExactMatchStr:4387828bulk*


 what does NOT work:

  itemModelNoExactMatchStr:4387828BU*
  itemModelNoExactMatchStr:4387828BUL*
  itemModelNoExactMatchStr:4387828BULK*


 below are the specifics of my field and fieldType

   <field name="itemModelNoExactMatchStr" type="text_exact" indexed="true"
     stored="true"/>


 <fieldType name="text_exact" class="solr.TextField"
     positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.TrimFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>

 thx
 mark





 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/why-does-affect-case-sensitivity-of-query-results-tp4059801.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: why does * affect case sensitivity of query results

2013-04-29 Thread geeky2
I was looking in Smiley's book on pages 129 and 130.

from the book,


No text analysis is performed on the search word containing the wildcard,
not even lowercasing. So if you want to find a word starting with Sma, then
sma* is required instead of Sma*, assuming the index side of the field's
type
includes lowercasing. This shortcoming is tracked on SOLR-219. Moreover,
if the field that you want to use the wildcard query on is stemmed in the
analysis, then smashing* would not find the original text Smashing because
the stemming process transforms this to smash. Consequently, don't stem.
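Given that limitation, the usual workaround is to lowercase the wildcard term on the client before it reaches Solr. A minimal sketch (not from this thread; it assumes the index-side analyzer lowercases, as text_exact above does):

```python
def normalize_wildcard(term: str) -> str:
    """Lowercase a wildcard term client-side, since Solr 3.x applies
    no analysis (not even lowercasing) to wildcard queries."""
    return term.lower()

# Build the query string the way the examples above do:
q = "itemModelNoExactMatchStr:" + normalize_wildcard("4387828BULK*")
# q == "itemModelNoExactMatchStr:4387828bulk*"
```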


thx
mark




--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-does-affect-case-sensitivity-of-query-results-tp4059801p4059812.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: why does * affect case sensitivity of query results

2013-04-29 Thread geeky2
here is the jira link:

https://issues.apache.org/jira/browse/SOLR-219





--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-does-affect-case-sensitivity-of-query-results-tp4059801p4059814.html
Sent from the Solr - User mailing list archive at Nabble.com.


java.lang.NullPointerException. I am trying to use CachedSqlEntityProcessor

2013-04-29 Thread srinalluri
I am on Solr 3.6.1.
The following entity gives a java.lang.NullPointerException. How can I debug
this? Here I am using CachedSqlEntityProcessor.

<entity name="vig8-article-mon" dataSource="vig8" pk="VCMID"
    preImportDeleteQuery="content_type:article AND repository:vig8qamon"
    query="select ID as VCMID from tab_story2">
  <entity name="recordid" dataSource="vig8"
      transformer="TemplateTransformer"
      query="select RECORDID from vgnasmomap where keystring1 =
        '${vig8-article-mon.VCMID}'">
    <field column="content_type" template="article" />
    <field column="RECORDID" name="native_id" />
    <field column="repository" template="vig8qamon" />
  </entity>
  <entity name="article_details" dataSource="vig8"
      transformer="ClobTransformer,RegexTransformer"
      query="select STORY_TITLE, STORY_HEADLINE, SOURCE, DECK,
        regexp_replace(body, '&lt;p&gt;\[(pullquote|summary)\]&lt;/p&gt;|\[video
        [0-9]+?\]|\[youtube .+?\]', '') as BODY, PUBLISHED_DATE, MODIFIED_DATE,
        DATELINE, REPORTER_NAME, TICKER_CODES, ADVERTORIAL_CONTENT from tab_story2"
      processor="CachedSqlEntityProcessor" where="id=vig8-article-mon.VCMID">
    <field column="STORY_TITLE" name="title" />
    <field column="DECK" name="description" clob="true" />
    <field column="PUBLISHED_DATE" name="date" />
    <field column="MODIFIED_DATE" name="last_modified_date" />
    <field column="BODY" name="body" clob="true" />
    <field column="SOURCE" name="source" />
    <field column="DATELINE" name="dateline" />
    <field column="STORY_HEADLINE" name="export_headline" />
  </entity>
</entity>

Here is the exception message:


SEVERE: Exception while processing: vig8-article-mon document :
SolrInputDocument[{repository=repository(1.0)={vig8qamon},
native_id=native_id(1.0)={8f2210474fea2310VgnVCM10d1c1a8c0RCRD},
content_type=content_type(1.0)={article}}]:org.apache.solr.handler.dataimport.DataImportHandlerException:
java.lang.NullPointerException
at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.pullRow(EntityProcessorWrapper.java:333)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:296)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:683)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
Caused by: java.lang.NullPointerException
at java.util.TreeMap.getEntry(TreeMap.java:342)
at java.util.TreeMap.get(TreeMap.java:273)
at
org.apache.solr.handler.dataimport.SortedMapBackedCache.add(SortedMapBackedCache.java:57)
at
org.apache.solr.handler.dataimport.DIHCacheSupport.populateCache(DIHCacheSupport.java:124)
at
org.apache.solr.handler.dataimport.DIHCacheSupport.getIdCacheData(DIHCacheSupport.java:176)
at
org.apache.solr.handler.dataimport.DIHCacheSupport.getCacheData(DIHCacheSupport.java:145)
at
org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:132)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:75)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.pullRow(EntityProcessorWrapper.java:330)




--
View this message in context: 
http://lucene.472066.n3.nabble.com/java-lang-NullPointerException-I-am-trying-to-use-CachedSqlEntityProcessor-tp4059815.html
Sent from the Solr - User mailing list archive at Nabble.com.


Exact and Partial Matches

2013-04-29 Thread Sandeep Mestry
Dear Experts,

I have a requirement for the exact matches and applying alphabetical
sorting thereafter.

To illustrate, the results should list exact matches first, with the rest
sorted alphabetically afterwards.

So, if there are 5 documents as below

Doc1
title: trees

Doc 2
title: plum trees

Doc 3
title: Money Trees (Legendary Trees)

Doc 4
title: Cork Trees

Doc 5
title: Old Trees

Then, if user searches with query term as 'trees', the results should be in
following order:

Doc 1 trees - Highest Rank
Doc 4 Cork Trees - Alphabetical afterwards..
Doc 3 Money Trees (Legendary Trees)
Doc 5 Old Trees
Doc 2 plum trees

I can achieve the alphabetical sorting by adding the title sort parameter.
However, Solr's relevancy score is higher for Doc 3 (the term matches
twice), so Doc 3 is ranked above Docs 4, 5 and 2.
So, it looks like:

Doc 1 trees - Highest Rank
Doc 3 Money Trees (Legendary Trees)
Doc 4 Cork Trees - Alphabetical afterwards..
Doc 5 Old Trees
Doc 2 plum trees

Can you tell me an easy way to achieve this requirement please?

I'm using Solr 4.0 and the *title* field is defined as follows:

<fieldType name="text_wc" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
        stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
        catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
        splitOnNumerics="0" preserveOriginal="1" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
        stemEnglishPossessive="0" generateWordParts="1" generateNumberParts="1"
        catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
        splitOnNumerics="0" preserveOriginal="1" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
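One common pattern for this requirement (an assumption on my part, not something proposed in this thread) is to keep a second, untokenized and lowercased copy of the title, boost or sort on a whole-field match against it, and break ties alphabetically. Solr has supported sorting by a function query since 3.1, so the first sort key can be an exact-match signal. A sketch of building such a request, where title_exact and title_sort are hypothetical copyField targets:

```python
from urllib.parse import urlencode

def build_query(term: str) -> str:
    """First sort key: a function query that is non-zero only for documents
    whose untokenized lowercase copy (title_exact) matches the whole term;
    second key: an alphabetical title_sort field. Both field names are
    hypothetical copies of title."""
    sort = f"query({{!v='title_exact:\"{term}\"'}},0) desc, title_sort asc"
    params = {"q": f"title:{term}", "sort": sort}
    return "http://localhost:8983/solr/select?" + urlencode(params)
```

The exact-match key is effectively binary here (match score vs. the 0 default), so among the non-exact matches only title_sort decides the order, which avoids Doc 3's double term match pushing it up.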



Many Thanks in advance,
Sandeep


RE: java.lang.NullPointerException. I am trying to use CachedSqlEntityProcessor

2013-04-29 Thread Dyer, James
This sounds like https://issues.apache.org/jira/browse/SOLR-3791, which was 
resolved in 3.6.2 / 4.0.

James Dyer
Ingram Content Group
(615) 213-4311


-Original Message-
From: srinalluri [mailto:nallurisr...@yahoo.com] 
Sent: Monday, April 29, 2013 11:41 AM
To: solr-user@lucene.apache.org
Subject: java.lang.NullPointerException. I am trying to use 
CachedSqlEntityProcessor

I am on Solr 3.6.1.
The following entity gives a java.lang.NullPointerException. How can I debug
this? Here I am using CachedSqlEntityProcessor.

<entity name="vig8-article-mon" dataSource="vig8" pk="VCMID"
    preImportDeleteQuery="content_type:article AND repository:vig8qamon"
    query="select ID as VCMID from tab_story2">
  <entity name="recordid" dataSource="vig8"
      transformer="TemplateTransformer"
      query="select RECORDID from vgnasmomap where keystring1 =
        '${vig8-article-mon.VCMID}'">
    <field column="content_type" template="article" />
    <field column="RECORDID" name="native_id" />
    <field column="repository" template="vig8qamon" />
  </entity>
  <entity name="article_details" dataSource="vig8"
      transformer="ClobTransformer,RegexTransformer"
      query="select STORY_TITLE, STORY_HEADLINE, SOURCE, DECK,
        regexp_replace(body, '&lt;p&gt;\[(pullquote|summary)\]&lt;/p&gt;|\[video
        [0-9]+?\]|\[youtube .+?\]', '') as BODY, PUBLISHED_DATE, MODIFIED_DATE,
        DATELINE, REPORTER_NAME, TICKER_CODES, ADVERTORIAL_CONTENT from tab_story2"
      processor="CachedSqlEntityProcessor" where="id=vig8-article-mon.VCMID">
    <field column="STORY_TITLE" name="title" />
    <field column="DECK" name="description" clob="true" />
    <field column="PUBLISHED_DATE" name="date" />
    <field column="MODIFIED_DATE" name="last_modified_date" />
    <field column="BODY" name="body" clob="true" />
    <field column="SOURCE" name="source" />
    <field column="DATELINE" name="dateline" />
    <field column="STORY_HEADLINE" name="export_headline" />
  </entity>
</entity>

Here is the exception message:


SEVERE: Exception while processing: vig8-article-mon document :
SolrInputDocument[{repository=repository(1.0)={vig8qamon},
native_id=native_id(1.0)={8f2210474fea2310VgnVCM10d1c1a8c0RCRD},
content_type=content_type(1.0)={article}}]:org.apache.solr.handler.dataimport.DataImportHandlerException:
java.lang.NullPointerException
at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.pullRow(EntityProcessorWrapper.java:333)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:296)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:683)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
Caused by: java.lang.NullPointerException
at java.util.TreeMap.getEntry(TreeMap.java:342)
at java.util.TreeMap.get(TreeMap.java:273)
at
org.apache.solr.handler.dataimport.SortedMapBackedCache.add(SortedMapBackedCache.java:57)
at
org.apache.solr.handler.dataimport.DIHCacheSupport.populateCache(DIHCacheSupport.java:124)
at
org.apache.solr.handler.dataimport.DIHCacheSupport.getIdCacheData(DIHCacheSupport.java:176)
at
org.apache.solr.handler.dataimport.DIHCacheSupport.getCacheData(DIHCacheSupport.java:145)
at
org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:132)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:75)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.pullRow(EntityProcessorWrapper.java:330)




--
View this message in context: 
http://lucene.472066.n3.nabble.com/java-lang-NullPointerException-I-am-trying-to-use-CachedSqlEntityProcessor-tp4059815.html
Sent from the Solr - User mailing list archive at Nabble.com.




Re: Bizarre Solr issue

2013-04-29 Thread Shawn Heisey

On 4/29/2013 8:15 AM, jack.drysdale@ustranscom.mil wrote:

Production environment is *nix running CF 9.0.0, with both Verity and Solr
collections.

Trying to list collections is breaking - one collection in particular is
breaking the CFCOLLECTION action=list: Error message states that the
solrconfig.xml file cannot be found.

I unregistered this collection via CFAdmin, then went into the file system
and deleted the folders for this collection and restarted both Application
and Solr services. Ran the script, again, and still getting the same error
message for the collection that we just completely removed.  It's NOT being
cached in the browser.

This is working fine in development (Windows environment, CF9.0.1).


CFCOLLECTION and CFAdmin are not part of Solr.  We have no way of 
knowing what happens when you do things in CFAdmin.  I do have one 
possible idea of what might be going wrong here, though.


Here's how multi-core Solr works in all versions prior to 4.3: The 
directory named with the solr.solr.home property (defaulting to ./solr) 
contains a file called solr.xml.  This file describes the index cores 
that Solr knows about and defines a few global settings.  Solr includes 
something called the CoreAdmin API for manipulating cores and solr.xml, 
which is probably utilized by CFAdmin.


If the solr.xml file is missing an attribute called persistent on the 
solr tag, or that attribute is set to false, then changes made using 
the CoreAdmin API are not persisted in the solr.xml file on disk, so 
when Solr restarts, it will use what it had before.
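For reference, a minimal pre-4.3 solr.xml with persistence enabled looks roughly like this (the core name and instanceDir are illustrative):

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="collection1" instanceDir="collection1" />
  </cores>
</solr>
```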


Note: SolrCloud (4.0 and later) does add the concept of collections - a 
cluster-wide view of multiple cores.  SolrCloud is not required, and 
with version 1.4.1, you won't have to worry about it.


Thanks,
Shawn



Re: Customizing Solr GUI

2013-04-29 Thread kneerosh
Thanks a lot for the responses. Now I'm sure I need Blacklight.

Suppose I had a website designed using any other standard method - how would
I have embedded a Solr search in it? Velocity and Blacklight are, as I
understand, useful when you are building a system from scratch and can design
a new search-specific GUI.

But what if I had a website and needed to embed Solr search in it - how
would I call Solr queries from an existing website? Is any tutorial
available for this?

A newbie, so please bear with me.






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Customizing-Solr-GUI-tp4059257p4059823.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Bizarre Solr issue

2013-04-29 Thread Jack.Drysdale.ctr
Hello, Shawn, and thanks for your reply.

I will look into this, ASAP.  I know that on one of the dev environments the
persistent flag is set to true; I'll check the others and the production.

I will also see if someone can get me a copy of the logs from the production
environment to see if any more detail is contained within.

Thanks,

Jack

-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Monday, April 29, 2013 12:19 PM
To: solr-user@lucene.apache.org
Subject: Re: Bizarre Solr issue

On 4/29/2013 8:15 AM, jack.drysdale@ustranscom.mil wrote:
 Production environment is *nix running CF 9.0.0, with both Verity and 
 Solr collections.

 Trying to list collections is breaking - one collection in particular 
 is breaking the CFCOLLECTION action=list: Error message states that 
 the solrconfig.xml file cannot be found.

 I unregistered this collection via CFAdmin, then went into the file 
 system and deleted the folders for this collection and restarted both 
 Application and Solr services. Ran the script, again, and still 
 getting the same error message for the collection that we just 
 completely removed.  It's NOT being cached in the browser.

 This is working fine in development (Windows environment, CF9.0.1).

CFCOLLECTION and CFAdmin are not part of Solr.  We have no way of knowing
what happens when you do things in CFAdmin.  I do have one possible idea of
what might be going wrong here, though.

Here's how multi-core Solr works in all versions prior to 4.3: The directory
named with the solr.solr.home property (defaulting to ./solr) contains a
file called solr.xml.  This file describes the index cores that Solr knows
about and defines a few global settings.  Solr includes something called the
CoreAdmin API for manipulating cores and solr.xml, which is probably
utilized by CFAdmin.

If the solr.xml file is missing an attribute called persistent on the solr
tag, or that attribute is set to false, then changes made using the
CoreAdmin API are not persisted in the solr.xml file on disk, so when Solr
restarts, it will use what it had before.

Note: SolrCloud (4.0 and later) does add the concept of collections - a
cluster-wide view of multiple cores.  SolrCloud is not required, and with
version 1.4.1, you won't have to worry about it.

Thanks,
Shawn



smime.p7s
Description: S/MIME cryptographic signature


Re: Customizing Solr GUI

2013-04-29 Thread Alexandre Rafalovitch
Does the website have a middleware? As in, is it static website or
something served dynamically from PHP, Ruby, Java, etc? If the later,
then you do the same thing blacklight does: you run your Solr server
and your middleware talks to it over HTTP connection. Then, you have
to figure out how to get data into Solr, so you need to index the
content from the database/files/etc.

If your website is fully static, then you still need some sort of
middleware for search, as it is not safe to expose Solr directly to
users.
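As a sketch of what that middleware call can look like (the host, core name, and field layout here are assumptions, not something from this thread), the middleware simply builds a select URL from the user's input and parses the JSON response:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def build_select_url(base_url: str, core: str, user_query: str,
                     rows: int = 10) -> str:
    """Turn a user's search-box input into a Solr select URL; the user
    never talks to Solr directly, only to this middleware."""
    params = urlencode({"q": user_query, "wt": "json", "rows": rows})
    return f"{base_url}/{core}/select?{params}"

def search(base_url: str, core: str, user_query: str):
    """Issue the request and return the matching documents."""
    with urlopen(build_select_url(base_url, core, user_query)) as resp:
        return json.load(resp)["response"]["docs"]
```

The middleware is also the right place to whitelist parameters and escape user input before it reaches Solr.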

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Mon, Apr 29, 2013 at 1:22 PM, kneerosh roshni_rajago...@yahoo.co.in wrote:
 Thanks a lot for the responses. Now I'm sure I need Blacklight.

 Suppose I had a website designed using any other standard method - how would
 I have embedded a Solr search in it? Velocity and Blacklight are, as I
 understand, useful when you are building a system from scratch and can design
 a new search-specific GUI.

 But what if I had a website and needed to embed Solr search in it - how
 would I call Solr queries from an existing website? Is any tutorial
 available for this?

 A newbie, so please bear with me.






 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Customizing-Solr-GUI-tp4059257p4059823.html
 Sent from the Solr - User mailing list archive at Nabble.com.


4.2.1 Tutorial

2013-04-29 Thread Jon Strayer
I can't be the only person to run into this, but I can't find any mention
of it anywhere.

I have Solr 4.2.1 installed under OSX 10.8.3.  I'm working my way through
the tutorial.

When I click on this link:
http://localhost:8983/solr/#/collection1/query I get the error message
"There exists no core with the name collection1."

This link works:
http://localhost:8983/solr/collection1/select?q=solr&wt=xml

What am I doing wrong?


-- 
To *know* is one thing, and to know for certain *that* we know is another.
--William James


LinkedIn'de bağlantı kurma daveti

2013-04-29 Thread somer81
LinkedIn




vibhoreng04 Lucene],

I'd like to add you to my professional network on LinkedIn.

- ömer sevinç

ömer sevinç
Lecturer, Computer Engineering, at Ondokuz Mayıs University Distance
Education Center
Samsun, Turkey

Confirm that you know ömer sevinç:
https://www.linkedin.com/e/-raxvo-hg440jtk-5y/isd/12853395556/gr7DJb-a/?hs=falsetok=1P2JgADTvmJBI1

--
You are receiving invitation emails to connect. To unsubscribe, click:
http://www.linkedin.com/e/-raxvo-hg440jtk-5y/k4DmoUl0INtaVAIq-J7z9PWKN77TMrUq-KEzulsJgeVVicpw-KNocoLGnzC/goo/ml-node%2Bs472066n3787592h24%40n3%2Enabble%2Ecom/20061/I4256117001_1/?hs=falsetok=3veZCCU2TmJBI1

(c) 2012 LinkedIn Corporation 2029 Stierlin Ct., Mountain View, CA 94043 USA


  




--
View this message in context: 
http://lucene.472066.n3.nabble.com/LinkedIn-de-ba-lant-kurma-daveti-tp4059846.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: 4.2.1 Tutorial

2013-04-29 Thread Furkan KAMACI
Check your logs from when you start up Solr, given that you get the error
"There exists no core with the name collection1." Do you see any error like
"core collection1 could not be created" or something similar?

2013/4/29 Jon Strayer j...@strayer.org

 I can't be the only person to run into this, but I can't find any mention
 of it anywhere.

 I have Solr 4.2.1 installed under OSX 10.8.3.  I'm working my way through
 the tutorial.

 When I click on this link: http://localhost:8983/solr/#/collection1/query I
 get the error message "There exists no core with the name collection1."

 This link works:
 http://localhost:8983/solr/collection1/select?q=solr&wt=xml

 What am I doing wrong?


 --
 To *know* is one thing, and to know for certain *that* we know is another.
 --William James



What Happens to Consistency if I kill a Leader and Startup it again?

2013-04-29 Thread Furkan KAMACI
I think about such situation:

Let's assume that I am indexing into my SolrCloud cluster. The leader has a
higher version than its replica (I have one leader and one replica for each
shard). If I kill the leader, the replica becomes the leader. When I start up
the old leader again, it will be a replica for the shard.

However, I think the old leader will then have fewer documents and a lower
version than the new leader. Does it cause a problem that this replica is
behind its leader?


Using properties from solrcore.properties in data-config.xml (Solr 4.2.1)

2013-04-29 Thread Arun Rangarajan
We are trying to upgrade from Solr 3.6.2 to Solr 4.2.1 and are having
problems with using properties in solrcore.properties inside
data-config.xml.

With Solr 3.6.2, we were able to directly use properties in
solrcore.properties inside data-config.xml like ${jdbc.driver},
${jdbc.username}, etc., but these no longer work. We get an exception like
this:

-

SEVERE: Full Import failed:java.lang.RuntimeException:
java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Could not
load driver:  Processing Document # 1

at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:266)

at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:422)

at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:487)

at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:468)

Caused by: java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Could not
load driver:  Processing Document # 1

at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:406)

at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:319)

at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:227)

... 3 more

Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException:
Could not load driver:  Processing Document # 1

at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:71)

at
org.apache.solr.handler.dataimport.JdbcDataSource.createConnectionFactory(JdbcDataSource.java:114)

at
org.apache.solr.handler.dataimport.JdbcDataSource.init(JdbcDataSource.java:62)

at
org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:394)

at
org.apache.solr.handler.dataimport.ContextImpl.getDataSource(ContextImpl.java:99)

at
org.apache.solr.handler.dataimport.SqlEntityProcessor.init(SqlEntityProcessor.java:53)

at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:74)

at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:423)

at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:404)

... 5 more

Caused by: java.lang.ClassNotFoundException: Unable to load  or
org.apache.solr.handler.dataimport.

at
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:911)

at
org.apache.solr.handler.dataimport.JdbcDataSource.createConnectionFactory(JdbcDataSource.java:112)

... 12 more

Caused by: org.apache.solr.common.SolrException: Error loading class ''

at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:440)

at
org.apache.solr.handler.dataimport.DocBuilder.loadClass(DocBuilder.java:901)

... 13 more

Caused by: java.lang.ClassNotFoundException:

at java.lang.Class.forName0(Native Method)

at java.lang.Class.forName(Unknown Source)

at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:424)

... 14 more

-

If I hard-code the property values in data-config.xml, then the import
works fine.

I even tried configuring them in solrconfig.xml like

<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
    <str name="jdbcDriver">${jdbc.driver}</str>
    <str name="jdbcHost">${jdbc.host}</str>
    <str name="jdbcPort">${jdbc.port}</str>
    <str name="jdbcUsername">${jdbc.username}</str>
    <str name="jdbcPassword">${jdbc.password}</str>
  </lst>
</requestHandler>

and use them in data-config.xml like ${dataimport.jdbcDriver} and
${dataimport.request.jdbcDriver}, but these don't work either.

So how does one pass properties to data-config.xml in Solr 4.2.1?
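One documented DataImportHandler mechanism worth trying (I have not verified it against 4.2.1 specifically) is to pass the values as request parameters on the import call and read them inside data-config.xml via ${dataimporter.request.*}; the parameter names below are illustrative:

```xml
<!-- invoked as:
     http://host:port/solr/core/dataimport?command=full-import
         &jdbcDriver=com.mysql.jdbc.Driver&jdbcUrl=...&jdbcUsername=...&jdbcPassword=... -->
<dataSource driver="${dataimporter.request.jdbcDriver}"
            url="${dataimporter.request.jdbcUrl}"
            user="${dataimporter.request.jdbcUsername}"
            password="${dataimporter.request.jdbcPassword}"/>
```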

Any help will be appreciated.

Thanks.


Bloom filters and optimized vs. unoptimized indices

2013-04-29 Thread Otis Gospodnetic
Hi,

I was looking at
http://lucene.apache.org/core/4_2_1/codecs/org/apache/lucene/codecs/bloom/BloomFilteringPostingsFormat.html
and this piece of text:

A PostingsFormat useful for low doc-frequency fields such as primary
keys. Bloom filters are maintained in a .blm file which offers
fast-fail for reads in segments known to have no record of the key.


Is this implying that if you are doing PK lookups AND you have a large
index (i.e. slow queries) it may actually be better to keep the index
unoptimized, so whole index segments can be skipped?
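The fast-fail idea can be sketched as follows (a toy model, not Lucene's actual BloomFilteringPostingsFormat): each segment carries a small bit set, and a primary-key lookup only probes the terms dictionary of segments whose filter answers "maybe":

```python
import hashlib

class SegmentBloom:
    """Toy per-segment Bloom filter. A 'no' answer is definitive, so a
    primary-key lookup can skip the segment entirely; a 'maybe' answer
    still requires the real terms-dictionary probe."""

    def __init__(self, keys, bits=1024, hashes=3):
        self.bits, self.hashes = bits, hashes
        self.bitset = 0
        for key in keys:
            for pos in self._positions(key):
                self.bitset |= 1 << pos

    def _positions(self, key):
        # Derive `hashes` bit positions from a salted digest of the key.
        for salt in range(self.hashes):
            digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.bits

    def might_contain(self, key):
        return all((self.bitset >> pos) & 1 for pos in self._positions(key))

def segments_to_probe(segments, key):
    """Return only the segments that could possibly hold the key."""
    return [name for name, bloom in segments if bloom.might_contain(key)]
```

The intuition behind the question: with many segments, most of them fast-fail and only one or two get a real probe, whereas a fully optimized index has a single segment that can never be ruled out (though it is also the only terms dictionary to search).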

Thanks,
Otis
--
SOLR Performance Monitoring - http://sematext.com/spm/index.html