Zookeeper could not read dataimport.properties

2013-09-21 Thread Prasi S
Hi,
I'm using a Solr 4.4 cloud setup with an external ZooKeeper and Solr running
in Tomcat.

For the initial indexing, I use CSV files to load data into Solr. Then we do
delta indexing from a database table.

My ZooKeeper always throws an exception saying it could not read
dataimport.properties.

Where should we configure ZooKeeper to create/read dataimport.properties?

Thanks,
Prasi
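
For anyone hitting the same thing: dataimport.properties is where the
DataImportHandler records last_index_time for delta imports. As far as I can
tell from the 4.x behavior, in SolrCloud mode DIH switches to
ZKPropertiesWriter and keeps those properties under the collection's config
in ZooKeeper rather than on local disk, so the config znode must be writable
by Solr. The writer can also be pinned explicitly in the DIH data-config; a
minimal sketch, where the JDBC details, table, and filename are placeholder
assumptions:

<dataConfig>
  <!-- explicit property writer; in cloud mode ZKPropertiesWriter is picked
       automatically, SimplePropertiesWriter in non-cloud mode -->
  <propertyWriter type="ZKPropertiesWriter" filename="dataimport.properties"/>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db"
              user="user" password="pass"/>
  <document>
    <entity name="item"
            query="SELECT id, title FROM item"
            deltaQuery="SELECT id FROM item
                        WHERE last_modified &gt; '${dataimporter.last_index_time}'"/>
  </document>
</dataConfig>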


Atomic updates with solr cloud in solr 4.4

2013-09-21 Thread Sesha Sendhil Subramanian
Hi,

I am using Solr 4.4 with 2 shards and 2 collections per shard, search and
meta.
I started the shards specifying numShards, and have checked that the router
used is the compositeId router.
Distributed indexing is done based on IDs sharing the same domain/prefix,
i.e. the 'customerB!' form, and the documents are distributed across the
shards correctly.
Querying for documents works as expected and returns all matching documents
across shards.
I try to do an atomic update on the search collection as follows:

curl http://localhost:8983/solr/search/update -H 'Content-type:application/json' -d '
[
 {
  "id": "c8cce27c1d8129d733a3df3de68dd675!c8cce27c1d8129d733a3df3de68dd675",
  "link_id_45454": {"set": "abcdegff"}
 }
]'
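
For reference, the same atomic update expressed in Solr's XML update format,
which should behave identically:

<add>
  <doc>
    <field name="id">c8cce27c1d8129d733a3df3de68dd675!c8cce27c1d8129d733a3df3de68dd675</field>
    <field name="link_id_45454" update="set">abcdegff</field>
  </doc>
</add>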

If the document resides on the shard queried, i.e. localhost:8983, the update
succeeds. If the document resides on shard 2, i.e. localhost:7574, the
update fails, and the error message I get is as follows:

15438547 [qtp386373885-75] INFO  org.apache.solr.update.processor.LogUpdateProcessor  ? [search] webapp=/solr path=/update params={} {} 0 1
15438548 [qtp386373885-75] ERROR org.apache.solr.core.SolrCore  ? org.apache.solr.common.SolrException: [doc=c8cce27c1d8129d733a3df3de68dd675!c8cce27c1d8129d733a3df3de68dd675] missing required field: variant_count
    at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:189)
    at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:73)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:210)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:556)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:692)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:435)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
    at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:392)
    at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:117)
    at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:101)
    at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:65)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at

Re: java.lang.LinkageError when using custom filters in multiple cores

2013-09-21 Thread Alexandre Rafalovitch
Did you try the latest Solr? There was a library-loading bug with multiple
cores. Not a perfect match for your description, but close enough.
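
If upgrading isn't an option: a LinkageError complaining about a duplicate
class definition is classically what you get when the same jar is visible to
more than one classloader, e.g. a copy of the custom filter jar in each
core's lib directory plus one in the webapp. One workaround worth trying is
loading the jar exactly once via sharedLib in the old-style solr.xml; a
minimal sketch, with the paths assumed and the core names taken from the
report below:

<solr persistent="true" sharedLib="lib">
  <!-- jars in <solr.home>/lib are loaded once and shared by all cores -->
  <cores adminPath="/admin/cores">
    <core name="favorite" instanceDir="favorite"/>
    <core name="user" instanceDir="user"/>
  </cores>
</solr>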

Regards,
Alex
On 21 Sep 2013 02:28, Hayden Muhl haydenm...@gmail.com wrote:

 I have two cores, 'favorite' and 'user', running in the same Tomcat instance.
 In each of these cores I have identical field types text_en, text_de,
 text_fr, and text_ja. These field types use some custom token filters I've
 written. Everything was going smoothly when I only had the 'favorite' core.
 When I added the 'user' core, I started getting java.lang.LinkageErrors
 thrown when I start up Tomcat. The error always happens with one of
 the classes I've written, but it's unpredictable which class the
 classloader chokes on.

 Here's the really strange part. I comment out the text_* fields in the
 user core and the errors go away (makes sense). I add text_en back in, no
 error (OK). I add text_fr back in, no error (OK). I add text_de back
 in, and I get the error (ah ha!). I comment text_de out again, and I
 still get the same error (wtf?).

 I also put a breakpoint at
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:424),
 and when I load everything one at a time, I don't get any errors.

 I'm running Tomcat 5.5.28, Java version 1.6.0_39 and Solr 4.2.0. I'm
 running this all within Eclipse 1.5.1 on a Mac. I have not tested this on a
 production-like system yet.

 Here's an example stack trace. In this case it was one of my Japanese
 filters, but other times it will choke on my synonym filter, or my compound
 word filter. The specific class it fails on doesn't seem to be relevant.

 SEVERE: null:java.lang.LinkageError: loader (instance of org/apache/catalina/loader/WebappClassLoader): attempted duplicate class definition for name: com/shopstyle/solrx/KatakanaVuFilterFactory
     at java.lang.ClassLoader.defineClass1(Native Method)
     at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
     at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
     at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
     at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
     at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
     at java.security.AccessController.doPrivileged(Native Method)
     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
     at org.apache.catalina.loader.WebappClassLoader.findClass(WebappClassLoader.java:904)
     at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1353)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:295)
     at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
     at java.lang.Class.forName0(Native Method)
     at java.lang.Class.forName(Class.java:249)
     at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:424)
     at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:462)
     at org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:89)
     at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
     at org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:392)
     at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:86)
     at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
     at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
     at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:373)
     at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:121)
     at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1018)
     at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1051)
     at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:634)
     at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
     at java.lang.Thread.run(Thread.java:680)

 - Hayden



Re: Cause of NullPointer Exception? (Solr with Spring Data)

2013-09-21 Thread Furkan KAMACI
Your Solr server may not be working correctly. You should give us
information from your Solr logs rather than from Spring. Can you reach the
Solr admin page?

On Friday, 20 September 2013, JMill apprentice...@googlemail.com wrote:
 I am unsure about the cause of the following NullPointerException. Any
 ideas?

 Thanks

 Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'aDocumentService': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: com.project.core.solr.repository.DocumentRepository com.project.core.solr.service.impl.DocumentServiceImpl.DocRepo; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'DocumentRepository': FactoryBean threw exception on object creation; nested exception is java.lang.NullPointerException
     at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessPropertyValues(AutowiredAnnotationBeanPostProcessor.java:288)
     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1116)
     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:519)
     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:458)
     at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
     at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:223)
     at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292)
     at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
     at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:626)
     at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:932)
     at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:479)
     at org.springframework.context.annotation.AnnotationConfigApplicationContext.<init>(AnnotationConfigApplicationContext.java:73)
     at com.project.core.solr..DocumentTester.main(DocumentTester.java:18)
 Caused by: org.springframework.beans.factory.BeanCreationException: Could not autowire field: com.project.core.solr.repository.DocumentRepository com.project.core.solr.service.impl.DocumentServiceImpl.DocRepo; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'DocumentRepository': FactoryBean threw exception on object creation; nested exception is java.lang.NullPointerException
     at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:514)
     at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:87)
     at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessPropertyValues(AutowiredAnnotationBeanPostProcessor.java:285)
     ... 12 more
 Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'DocumentRepository': FactoryBean threw exception on object creation; nested exception is java.lang.NullPointerException
     at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.doGetObjectFromFactoryBean(FactoryBeanRegistrySupport.java:149)
     at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.getObjectFromFactoryBean(FactoryBeanRegistrySupport.java:102)
     at org.springframework.beans.factory.support.AbstractBeanFactory.getObjectForBeanInstance(AbstractBeanFactory.java:1454)
     at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:306)
     at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
     at org.springframework.beans.factory.support.DefaultListableBeanFactory.findAutowireCandidates(DefaultListableBeanFactory.java:910)
     at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:853)
     at org.springframework.beans.factory.support.DefaultListableBeanFactory.resolveDependency(DefaultListableBeanFactory.java:768)
     at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:486)
     ... 14 more
 Caused by: 

Re: Cause of NullPointer Exception? (Solr with Spring Data)

2013-09-21 Thread JMill
I am able to reach http://localhost:8983/solr/#/

Here is the log content. It's not much.

Time      Level  Logger    Message
12:38:47  WARN   SolrCore  [collection1] Solr index directory
'/usr/local/Cellar/solr/4.4.0/libexec/example/solr/collection1/data/index'
doesn't exist. Creating new index..



On Sat, Sep 21, 2013 at 12:57 PM, Furkan KAMACI furkankam...@gmail.com wrote:

 Your Solr server may not be working correctly. You should give us
 information from your Solr logs rather than from Spring. Can you reach the
 Solr admin page?

 On Friday, 20 September 2013, JMill apprentice...@googlemail.com wrote:
  I am unsure about the cause of the following NullPointerException. Any
  ideas?
 
  Thanks
 
  Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'aDocumentService': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: com.project.core.solr.repository.DocumentRepository com.project.core.solr.service.impl.DocumentServiceImpl.DocRepo; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'DocumentRepository': FactoryBean threw exception on object creation; nested exception is java.lang.NullPointerException
      at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessPropertyValues(AutowiredAnnotationBeanPostProcessor.java:288)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1116)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:519)
      at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:458)
      at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
      at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:223)
      at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292)
      at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
      at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:626)
      at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:932)
      at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:479)
      at org.springframework.context.annotation.AnnotationConfigApplicationContext.<init>(AnnotationConfigApplicationContext.java:73)
      at com.project.core.solr..DocumentTester.main(DocumentTester.java:18)
  Caused by: org.springframework.beans.factory.BeanCreationException: Could not autowire field: com.project.core.solr.repository.DocumentRepository com.project.core.solr.service.impl.DocumentServiceImpl.DocRepo; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'DocumentRepository': FactoryBean threw exception on object creation; nested exception is java.lang.NullPointerException
      at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:514)
      at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:87)
      at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessPropertyValues(AutowiredAnnotationBeanPostProcessor.java:285)
      ... 12 more
  Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'DocumentRepository': FactoryBean threw exception on object creation; nested exception is java.lang.NullPointerException
      at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.doGetObjectFromFactoryBean(FactoryBeanRegistrySupport.java:149)
      at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.getObjectFromFactoryBean(FactoryBeanRegistrySupport.java:102)
      at org.springframework.beans.factory.support.AbstractBeanFactory.getObjectForBeanInstance(AbstractBeanFactory.java:1454)
      at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:306)
      at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
      at

Re: searching within documents

2013-09-21 Thread Nutan
I have been trying to resolve the problem of searching within a doc; it wasn't
working, so I thought of installing Solr on another system. I followed the same
process: install Tomcat -> create the solr-home folder -> solr.xml -> then I get
the homepage (admin) of Solr, and I followed the Solr cookbook for the
extracting handler, but I get this error:
update/extract/ not found on this server.
So now I am stuck on both systems, with two different errors on two
different machines.
Coming back to this error: I want to search within documents. This is the
contents of schema.xml:
<schema name="documents">
 <fields>
  <field name="id" type="string" indexed="true" stored="true" required="true"
         multiValued="false"/>
  <field name="author" type="string" indexed="true" stored="true"
         multiValued="true"/>
  <field name="comments" type="text" indexed="true" stored="true"
         multiValued="false"/>
  <field name="keywords" type="text" indexed="true" stored="true"
         multiValued="false"/>
  <field name="contents" type="string" indexed="true" stored="true"
         multiValued="false"/>
  <field name="title" type="text" indexed="true" stored="true"
         multiValued="false"/>
  <field name="revision_number" type="string" indexed="true" stored="true"
         multiValued="false"/>
  <field name="_version_" type="long" indexed="true" stored="true"
         multiValued="false"/>
  <dynamicField name="ignored_*" type="string" indexed="false" stored="true"
         multiValued="true"/>
 </fields>
 <types>
  <fieldType name="string" class="solr.StrField"/>
  <fieldType name="integer" class="solr.IntField"/>
  <fieldType name="long" class="solr.LongField"/>
  <fieldType name="text" class="solr.TextField">
   <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>…
   </analyzer>
  </fieldType>
  <fieldtype name="ignored" stored="false" indexed="false" multiValued="true"
         class="solr.StrField"/>
 </types>
 <uniqueKey>id</uniqueKey>
</schema>

In my solrconfig I have defined the standard handler for select as:

<requestHandler name="standard" class="solr.StandardRequestHandler" default="true">
 <lst name="defaults">
  <int name="rows">20</int>
  <str name="fl">*</str>
 </lst>
</requestHandler>

This is the example doc which I want to search (this is the output for a *:*
query):

<doc>
 <str name="id">8</str>
 <arr name="author">
  <str>nutan shinde</str>
 </arr>
 <str name="comments">best book for solr</str>
 <str name="keywords">solr,lucene,apache tika</str>
 <str name="contents">
  solr,lucene is used for search based service.Google works uses web
  crawler.Lucene can implement web crawler
 </str>
 <str name="title">solr enterprise search server</str>
 <str name="revision_number">00123467889767</s…
</doc>

I indexed this record using an XML file.
I have no idea about copy fields, so please help me.
My Tomcat is working normally.





Re: Atomic updates with solr cloud in solr 4.4

2013-09-21 Thread Yonik Seeley
I can't reproduce this.
I tried starting up a 2 shard cluster and then followed the example here:
http://yonik.com/solr/atomic-updates/

book1 was on shard2 (port 7574) and everything still worked fine.

 missing required field: variant_count

Perhaps the problem is document specific... What can you say about
this variant_count field?
Is it stored?  Is it the target of a copyField?
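
For what it's worth: an atomic update rebuilds the whole stored document on
the shard that owns it, so every field not explicitly set in the update
(copyField targets aside) has to be stored for its old value to survive. If
variant_count is indexed but not stored, the rebuilt document loses it and
then trips the required-field check. A schema sketch of the combination worth
ruling out, the type name being an assumption:

<!-- breaks atomic updates: required but not stored -->
<field name="variant_count" type="int" indexed="true" stored="false" required="true"/>

<!-- atomic-update-safe version -->
<field name="variant_count" type="int" indexed="true" stored="true" required="true"/>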


-Yonik
http://lucidworks.com




On Tue, Sep 17, 2013 at 12:56 PM, Sesha Sendhil Subramanian
seshasend...@indix.com wrote:
 curl http://localhost:8983/solr/search/update -H 'Content-type:application/json' -d '
 [
  {
   "id": "c8cce27c1d8129d733a3df3de68dd675!c8cce27c1d8129d733a3df3de68dd675",
   "link_id_45454": {"set": "abcdegff"}
  }
 ]'

 I have two collections, search and meta. I want to do an update in the
 search collection.
 If I pick a document in the same shard (localhost:8983), the update succeeds:

 15350327 [qtp386373885-19] INFO  org.apache.solr.update.processor.LogUpdateProcessor  ? [search] webapp=/solr path=/update params={} {add=[6cfcb56ca52b56ccb1377a7f0842e74d!6cfcb56ca52b56ccb1377a7f0842e74d (1446444025873694720)]} 0 5

 If I pick a document on a different shard (localhost:7574), the update fails:

 15438547 [qtp386373885-75] INFO  org.apache.solr.update.processor.LogUpdateProcessor  ? [search] webapp=/solr path=/update params={} {} 0 1
 15438548 [qtp386373885-75] ERROR org.apache.solr.core.SolrCore  ? org.apache.solr.common.SolrException: [doc=c8cce27c1d8129d733a3df3de68dd675!c8cce27c1d8129d733a3df3de68dd675] missing required field: variant_count
     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:189)
     at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:73)
     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:210)
     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
     at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
     at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:556)
     at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:692)
     at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:435)
     at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
     at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:392)
     at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:117)
     at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:101)
     at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:65)
     at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
     at org.eclipse.jetty.server.Server.handle(Server.java:368)
     at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
     at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
     at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
     at

Re: requested url solr/update/extract not available on this server

2013-09-21 Thread Nutan
Yes, I do get the Solr admin page. And I'm not using the example config file; I
have created my own for my project, as required. I have also defined
/update/extract in solrconfig.xml.


On Tue, Sep 17, 2013 at 4:45 AM, Chris Hostetter-3 [via Lucene] 
ml-node+s472066n409045...@n3.nabble.com wrote:


 : Is /solr/update working?

 more importantly: does /solr/ work in your browser and return anything
 useful?  (nothing you've told us yet gives us any way of knowing if
 Solr is even up and running)

 if 'http://localhost:8080/solr/' shows you the solr admin UI, and you are
 using the stock Solr 4.2 example configs, then
 http://localhost:8080/solr/update/extract should not give you a 404
 error.

 if however you are using some other configs, it might not work unless
 those configs register a handler with the path /update/extract.

 Using the jetty setup provided with 4.2, and the example configs (from
 4.2) I was able to index a sample PDF just fine using your curl command...

 hossman@frisbee:~/tmp$ curl "http://localhost:8983/solr/update/extract?literal.id=1&commit=true" -F
 myfile=@stump.winners.san.diego.2013.pdf
 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">0</int><int name="QTime">1839</int></lst>
 </response>





 :
 : Check solrconfig to see that /update/extract is configured as in the
 standard
 : Solr example.
 :
 : Does /solr/update/extract work for you using the standard Solr example?
 :
 : -- Jack Krupansky
 :
 : -Original Message- From: Nutan
 : Sent: Sunday, September 15, 2013 2:37 AM
 : To: [hidden email]
 : Subject: requested url solr/update/extract not available on this server
 :
 : I am working on Solr 4.2 on Windows 7. I am trying to index PDF files. I
 : referred to Solr Cookbook 4. Tomcat is using port 8080. I get this
 : error: requested url solr/update/extract not available on this server
 : when my curl is:
 : curl "http://localhost:8080/solr/update/extract?literal.id=1&commit=true" -F
 : myfile=@cookbook.pdf
 : There is no entry in the log files. Please help.
 :
 :
 :

 -Hoss








Re: searching within documents

2013-09-21 Thread Nutan Shinde
And this works:
localhost:8080/solr/select?q=title:solr
This gives the required doc as output, but
localhost:8080/solr/select?q=contents:solr
gives numFound as 0.

This is the new edited schema.xml:

<schema name="documents">
 <fields>
  <field name="id" type="string" indexed="true" stored="true" required="true"
         multiValued="false"/>
  <field name="author" type="string" indexed="true" stored="true"
         multiValued="true"/>
  <field name="comments" type="text" indexed="true" stored="true"
         multiValued="false"/>
  <field name="keywords" type="text" indexed="true" stored="true"
         multiValued="false"/>
  <field name="contents" type="text" indexed="true" stored="true"
         multiValued="false"/>
  <field name="title" type="text" indexed="true" stored="true"
         multiValued="false"/>
  <field name="revision_number" type="string" indexed="true" stored="true"
         multiValued="false"/>
  <field name="_version_" type="long" indexed="true" stored="true"
         multiValued="false"/>
  <dynamicField name="ignored_*" type="string" indexed="false" stored="true"
         multiValued="true"/>
  <copyfield source="*" dest="text"/>
 </fields>
 <types>
  <fieldType name="string" class="solr.StrField"/>
  <fieldType name="integer" class="solr.IntField"/>
  <fieldType name="long" class="solr.LongField"/>
  <fieldType name="text" class="solr.TextField">
   <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>…
   </analyzer>
   <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>…
   </analyzer>
  </fieldType>
  <fieldtype name="ignored" stored="false" indexed="false" multiValued="true"
         class="solr.StrField"/>
 </types>
 <uniqueKey>id</uniqueKey>
</schema>


On Sat, Sep 21, 2013 at 7:58 PM, Nutan nutanshinde1...@gmail.com wrote:

 I have been trying to resolve the problem of searching within a doc; it wasn't
 working, so I thought of installing Solr on another system. I followed the same
 process: install Tomcat -> create the solr-home folder -> solr.xml -> then I get
 the homepage (admin) of Solr, and I followed the Solr cookbook for the
 extracting handler, but I get this error:
 update/extract/ not found on this server.
 So now I am stuck on both systems, with two different errors on two
 different machines.
 Coming back to this error: I want to search within documents. This is the
 contents of schema.xml:
 <schema name="documents">
  <fields>
   <field name="id" type="string" indexed="true" stored="true" required="true"
          multiValued="false"/>
   <field name="author" type="string" indexed="true" stored="true"
          multiValued="true"/>
   <field name="comments" type="text" indexed="true" stored="true"
          multiValued="false"/>
   <field name="keywords" type="text" indexed="true" stored="true"
          multiValued="false"/>
   <field name="contents" type="string" indexed="true" stored="true"
          multiValued="false"/>
   <field name="title" type="text" indexed="true" stored="true"
          multiValued="false"/>
   <field name="revision_number" type="string" indexed="true" stored="true"
          multiValued="false"/>
   <field name="_version_" type="long" indexed="true" stored="true"
          multiValued="false"/>
   <dynamicField name="ignored_*" type="string" indexed="false" stored="true"
          multiValued="true"/>
  </fields>
  <types>
   <fieldType name="string" class="solr.StrField"/>
   <fieldType name="integer" class="solr.IntField"/>
   <fieldType name="long" class="solr.LongField"/>
   <fieldType name="text" class="solr.TextField">
    <analyzer>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>…
    </analyzer>
   </fieldType>
   <fieldtype name="ignored" stored="false" indexed="false" multiValued="true"
          class="solr.StrField"/>
  </types>
  <uniqueKey>id</uniqueKey>
 </schema>

 In my solrconfig I have defined the standard handler for select as:

 <requestHandler name="standard" class="solr.StandardRequestHandler" default="true">
  <lst name="defaults">
   <int name="rows">20</int>
   <str name="fl">*</str>
  </lst>
 </requestHandler>

 This is the example doc which I want to search (this is the output for a *:*
 query):

 <doc>
  <str name="id">8</str>
  <arr name="author">
   <str>nutan shinde</str>
  </arr>
  <str name="comments">best book for solr</str>
  <str name="keywords">solr,lucene,apache tika</str>
  <str name="contents">
   solr,lucene is used for search based service.Google works uses web
   crawler.Lucene can implement web crawler
  </str>

Re: solr atomic updates stored=true, and copyField limitation

2013-09-21 Thread Shawn Heisey
On 9/19/2013 6:47 AM, Tanguy Moal wrote:
 Quoting http://wiki.apache.org/solr/Atomic_Updates#Caveats_and_Limitations :
 "all fields in your SchemaXml must be configured as stored="true" except for
 fields which are <copyField/> destinations -- which must be configured as
 stored="false""

For fields created by copyField, the source field(s) should have
stored=true.  The destination field should have stored=false.
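
A minimal sketch of that layout, with hypothetical field names:

<!-- sources: stored, so results and atomic updates can use them -->
<field name="title"    type="text" indexed="true" stored="true"/>
<field name="keywords" type="text" indexed="true" stored="true"/>
<!-- copyField destination: search-only, so not stored -->
<field name="catchall" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="catchall"/>
<copyField source="keywords" dest="catchall"/>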

Forgetting about atomic updates for a minute, the reason is pretty
simple, especially if you have multiple source fields being dropped into
one destination field: storing both of them makes your index bigger
and makes it take longer to retrieve search results, particularly with
versions 4.1 or later, because stored values are compressed.

I think you've hit on the exact reason why the caveat exists for
copyFields and atomic updates -- if the source field isn't stored, then
the actual indexed document won't have the source field, which means
that the "doesn't exist" value will be copied over to the destination,
overwriting any actual value that might exist for that field.

It's arguable that it's working as designed, and also working as
documented, both in the wiki and the reference guide, which both say
that all source fields must be stored.

https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
http://wiki.apache.org/solr/Atomic_Updates

You could still file a bug (jira issue) if you like, but given that the
documentation is pretty clear, it might not get fixed.

Thanks,
Shawn


Re: Problem running EmbeddedSolr (spring data)

2013-09-21 Thread Erick Erickson
bq: Caused by: java.lang.NoSuchMethodError:

This usually means that you have a mixture of old and new jars around
and have compiled against one and are finding the other one in your
classpath.
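
In this particular case the missing method is a CoreContainer constructor,
and the CoreContainer constructors changed in Solr 4.4 while spring-data-solr
1.0.0.RC1 targeted an earlier 4.x, so aligning the solr-core version with
whatever that RC1 POM declares is the usual fix. A hedged sketch, the 4.3.1
value being an assumption to verify against the spring-data-solr POM:

<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-core</artifactId>
  <!-- assumed: the pre-4.4 line spring-data-solr 1.0.0.RC1 was built against -->
  <version>4.3.1</version>
</dependency>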

Best,
Erick

On Fri, Sep 20, 2013 at 9:37 AM, JMill apprentice...@googlemail.com wrote:
 What is the cause of this stacktrace?

 Working with the following Solr Maven dependencies:

 <solr-core-version>4.4.0</solr-core-version>
 <spring-data-solr-version>1.0.0.RC1</spring-data-solr-version>

 Stacktrace

 SEVERE: Exception sending context initialized event to listener instance of class org.springframework.web.context.ContextLoaderListener
 org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'solrServerFactoryBean' defined in class path resource [com/project/core/config/EmbeddedSolrContext.class]: Invocation of init method failed; nested exception is java.lang.NoSuchMethodError: org.apache.solr.core.CoreContainer.<init>(Ljava/lang/String;Ljava/io/File;)V
     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1482)
     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:521)
     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:458)
     at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
     at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:223)
     at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292)
     at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
     at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:608)
     at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:932)
     at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:479)
     at org.springframework.web.context.ContextLoader.configureAndRefreshWebApplicationContext(ContextLoader.java:389)
     at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:294)
     at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:112)
     at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4887)
     at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5381)
     at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
     at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1559)
     at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1549)
     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
     at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.NoSuchMethodError: org.apache.solr.core.CoreContainer.<init>(Ljava/lang/String;Ljava/io/File;)V
     at org.springframework.data.solr.server.support.EmbeddedSolrServerFactory.createPathConfiguredSolrServer(EmbeddedSolrServerFactory.java:96)
     at org.springframework.data.solr.server.support.EmbeddedSolrServerFactory.initSolrServer(EmbeddedSolrServerFactory.java:72)
     at org.springframework.data.solr.server.support.EmbeddedSolrServerFactoryBean.afterPropertiesSet(EmbeddedSolrServerFactoryBean.java:41)
     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1541)
     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1479)
     ... 22 more



 //Config Class
 @Configuration
 @EnableSolrRepositories("core.solr.repository")
 @Profile("dev")
 @PropertySource("classpath:solr.properties")
 public class EmbeddedSolrContext {

     @Resource
     private Environment environment;

     @Bean
     public EmbeddedSolrServerFactoryBean solrServerFactoryBean() {
         EmbeddedSolrServerFactoryBean factory = new EmbeddedSolrServerFactoryBean();
         factory.setSolrHome(environment.getRequiredProperty("solr.solr.home"));
         return factory;
     }

     @Bean
     public SolrTemplate solrTemplate() throws Exception {
         return new SolrTemplate(solrServerFactoryBean().getObject());
     }
 }

Re: Need help understanding the use cases behind core auto-discovery

2013-09-21 Thread Erick Erickson
Also consider where SolrCloud is going. Trying to correctly maintain
all the solr.xml files yourself on all the nodes would have
been...interesting. On all the machines in your 200 node cluster.
With 17 different collections. With nodes coming and going. With
splitting shards. With...

Collections are almost guaranteed to be distributed unevenly (e.g. a
big collection might have 20 shards and a small collection 3 in the
same cluster). So each node used to require a solr.xml that was unique as
far as everything in the <cores> tag. But everything _not_ in the
<cores> tag is common. Say you wanted to change the
shardHandlerFactory (or any other setting we put in solr.xml that
wouldn't have gone into the old <cores> tag). In the old-style way of
doing things, since each solr.xml file on each node has potentially a
different set of cores, you'd have to edit each and every one of them.

The older way of doing this is fine as long as each solr.xml on each
machine is self-consistent. So auto-discovery essentially automates
that self-consistency.

It also makes it possible to have Zookeeper manage your solr.xml and
auto-distribute it to new nodes (or update existing) which would have
taken a lot of effort to get right without auto-discovery. So changing
the shardHandlerFactory consists of changing the solr.xml file and
pushing it to ZooKeeper (don't quite remember the right JIRA, but you
can do this now).

I suppose it's like all other refactorings. Solr.xml had its origin
in the single-core days; then, when multi-cores came into being, it was
expanded to include that information, but eventually became, as Yonik
says, unnecessary central configuration, which started becoming a
limitation.
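
Concretely, an auto-discovered core is declared by dropping a small
properties file into the core's directory under solr home; a minimal sketch,
with the name and values being assumptions:

# <solr.home>/mycore/core.properties
name=mycore
loadOnStartup=true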

FWIW,
Erick

On Fri, Sep 20, 2013 at 9:45 AM, Timothy Potter thelabd...@gmail.com wrote:
 Exactly the insight I was looking for! Thanks Yonik ;-)


 On Fri, Sep 20, 2013 at 10:37 AM, Yonik Seeley yo...@lucidworks.com wrote:

 On Fri, Sep 20, 2013 at 11:56 AM, Timothy Potter thelabd...@gmail.com
 wrote:
  Trying to add some information about core.properties and auto-discovery
 in
  Solr in Action and am at a loss for what to tell the reader is the
 purpose
  of this feature.

 IMO, it was more a removal of unnecessary central configuration.
 You previously had to list the core in solr.xml, and now you don't.
 Cores should be fully self-describing so that it should be easy to
 move them in the future just by moving the core directory (although
 that may not yet work...)

 -Yonik
 http://lucidworks.com

  Can anyone point me to any background information about core
  auto-discovery? I'm not interested in the technical implementation
 details.
  Mainly I'm trying to understand the motivation behind having this feature
  as it seems unnecessary with the Core Admin API. Best I can tell is it
  removes a manual step of firing off a call to the Core Admin API or
 loading
  a core from the Admin UI. If that's it and I'm overthinking it, then cool
  but was expecting more of an ah-ha moment with this feature ;-)
 
  Any insights you can share are appreciated.
 
  Thanks.
  Tim



Re: SolrCloud setup - any advice?

2013-09-21 Thread Erick Erickson
About caches. The queryResultCache is only useful when you expect there
to be a number of _identical_ queries. Think of this cache as a map where
the key is the query and the value is just a list of N document IDs (internal)
where N is your window size. Paging is often the place where this is used.
Take a look at your admin page for this cache, you can see the hit rates.
But the take-away is that this is a very small cache memory-wise; varying
it is probably not a great predictor of memory usage.

The filterCache is more intense memory-wise; it's another map where the
key is the fq clause and the value is bounded by maxDoc/8 bytes. Take a
close look at this in the admin screen and see what the hit ratio is. It may
be that you can make it much smaller and still get a lot of benefit,
_especially_ considering it could occupy about 44GB of memory:
(43,000,000 / 8) bytes ~ 5.4MB per entry, times 8192 entries ~ 44GB.
And the autowarm count is excessive in most cases from what I've seen.
Cutting the autowarm down to, say, 16 may not make a noticeable difference
in your response time. And if you're using NOW in your fq clauses, it's
almost totally useless; see:
http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/
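
The fix described there, for what it's worth, is rounding NOW so that the
filter string repeats and can actually hit the cache -- e.g.
fq=pub_date:[NOW/DAY-7DAYS TO NOW/DAY] rather than
fq=pub_date:[NOW-7DAYS TO NOW] (field name hypothetical).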

Also, read Uwe's excellent blog about MMapDirectory here:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
for some problems with over-allocating memory to the JVM. Of course
if you're hitting OOMs, well.

bq: order them by one of their fields.
This is one place I'd look first. How many unique values are in each field
that you sort on? This is one of the major memory consumers. You can
get a sense of this by looking at admin/schema-browser and selecting
the fields you sort on. There's a text box with the number of terms returned,
then a / ### where ### is the total count of unique terms in the field. NOTE:
in 4.4 this will be -1 for multiValued fields, but you shouldn't be sorting on
those anyway. How many fields are you sorting on anyway, and of what types?

For your SolrCloud experiments, what are your soft and hard commit intervals?
Because something is really screwy here. With sharding moving the
number of docs per shard down this low, things should be fast. Back to the
point above, the only good explanation I can come up with at this remove is
that the fields you sort on have a LOT of unique values. It's possible that
the total number of unique values isn't scaling with sharding. That is, each
shard may have, say, 90% of all unique terms (number from thin air). Worth
checking anyway, but a stretch.

This is definitely unusual...

Best,
Erick


On Thu, Sep 19, 2013 at 8:20 AM, Neil Prosser neil.pros...@gmail.com wrote:
 Apologies for the giant email. Hopefully it makes sense.

 We've been trying out SolrCloud to solve some scalability issues with our
 current setup and have run into problems. I'd like to describe our current
 setup, our queries and the sort of load we see and am hoping someone might
 be able to spot the massive flaw in the way I've been trying to set things
 up.

 We currently run Solr 4.0.0 in the old style Master/Slave replication. We
 have five slaves, each running Centos with 96GB of RAM, 24 cores and with
 48GB assigned to the JVM heap. Disks aren't crazy fast (i.e. not SSDs) but
 aren't slow either. Our GC parameters aren't particularly exciting, just
 -XX:+UseConcMarkSweepGC. Java version is 1.7.0_11.

 Our index size ranges between 144GB and 200GB (when we optimise it back
 down, since we've had bad experiences with large cores). We've got just
 over 37M documents; some are smallish, but most range between 1000-6000
 bytes. We regularly update documents, so large portions of the index will be
 touched, leading to a maxDocs value of around 43M.

 Query load ranges between 400req/s to 800req/s across the five slaves
 throughout the day, increasing and decreasing gradually over a period of
 hours, rather than bursting.

 Most of our documents have upwards of twenty fields. We use different
 fields to store territory-variant values (we have around 30 territories)
 and also boost based on the values in some of these fields (integer ones).

 So an average query can do a range filter by two of the territory variant
 fields, filter by a non-territory variant field. Facet by a field or two
 (may be territory variant). Bring back the values of 60 fields. Boost query
 on field values of a non-territory variant field. Boost by values of two
 territory-variant fields. Dismax query on up to 20 fields (with boosts) and
 phrase boost on those fields too. They're pretty big queries. We don't do
 any index-time boosting. We try to keep things dynamic so we can alter our
 boosts on-the-fly.

 Another common query is to list documents with a given set of IDs and
 select documents with a common reference and order them by one of their
 fields.

 Auto-commit every 30 minutes. Replication polls every 30 minutes.

 Document cache:
   * initialSize - 32768
   * size - 32768

 Filter cache:
   * autowarmCount - 

Re: requested url solr/update/extract not available on this server

2013-09-21 Thread Erick Erickson
bq: And I'm not using the example config file

It looks like you have not included the request handler in your solrconfig.xml,
something like (from the stock distro):

  <!-- Solr Cell Update Request Handler

       http://wiki.apache.org/solr/ExtractingRequestHandler

    -->
  <requestHandler name="/update/extract"
                  startup="lazy"
                  class="solr.extraction.ExtractingRequestHandler">
    <lst name="defaults">
      <str name="lowernames">true</str>
      <str name="uprefix">ignored_</str>

      <!-- capture link hrefs but ignore div attributes -->
      <str name="captureAttr">true</str>
      <str name="fmap.a">links</str>
      <str name="fmap.div">ignored_</str>
    </lst>
  </requestHandler>
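
Note that ExtractingRequestHandler lives in the extraction contrib, so
solrconfig.xml also has to pull the Solr Cell and Tika jars onto the
classpath; the stock 4.x configs do it with lib directives along these
lines, with the relative paths depending on your install layout:

  <lib dir="../../../contrib/extraction/lib" regex=".*\.jar"/>
  <lib dir="../../../dist/" regex="solr-cell-\d.*\.jar"/>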

I'd start with the stock config and try removing things one-by-one...

Best,
Erick

On Sat, Sep 21, 2013 at 7:34 AM, Nutan nutanshinde1...@gmail.com wrote:
 Yes, I do get the Solr admin page. And I'm not using the example config
 file; I have created my own for my project, as required. I have also defined
 /update/extract in solrconfig.xml.


 On Tue, Sep 17, 2013 at 4:45 AM, Chris Hostetter-3 [via Lucene] 
 ml-node+s472066n409045...@n3.nabble.com wrote:


 : Is /solr/update working?

 more importantly: does /solr/ work in your browser and return anything
 useful?  (nothing you've told us yet gives us any way of knowing if
 Solr is even up and running)

 if 'http://localhost:8080/solr/' shows you the solr admin UI, and you are
 using the stock Solr 4.2 example configs, then
 http://localhost:8080/solr/update/extract should not give you a 404
 error.

 if however you are using some other configs, it might not work unless
 those configs register a handler with the path /update/extract.

 Using the jetty setup provided with 4.2, and the example configs (from
 4.2) I was able to index a sample PDF just fine using your curl command...

 hossman@frisbee:~/tmp$ curl "http://localhost:8983/solr/update/extract?literal.id=1&commit=true" -F
 myfile=@stump.winners.san.diego.2013.pdf
 <?xml version="1.0" encoding="UTF-8"?>
 <response>
 <lst name="responseHeader"><int name="status">0</int><int name="QTime">1839</int></lst>
 </response>





 :
 : Check solrconfig to see that /update/extract is configured as in the
 standard
 : Solr example.
 :
 : Does /solr/update/extract work for you using the standard Solr example?
 :
 : -- Jack Krupansky
 :
 : -Original Message- From: Nutan
 : Sent: Sunday, September 15, 2013 2:37 AM
 : To: [hidden email]
 : Subject: requested url solr/update/extract not available on this server
 :
 : I am working on Solr 4.2 on Windows 7. I am trying to index PDF files. I
 : referred to Solr Cookbook 4. Tomcat is using port 8080. I get this
 : error: requested url solr/update/extract not available on this server
 : when my curl is:
 : curl "http://localhost:8080/solr/update/extract?literal.id=1&commit=true" -F
 : myfile=@cookbook.pdf
 : There is no entry in the log files. Please help.
 :
 :
 :

 -Hoss









Re: Getting a query parameter in a TokenFilter

2013-09-21 Thread Isaac Hebsh
Thinking about that again:
we could do this work as a search component, manipulating the query string.
The cons are the double QParser work and the double tokenization work.

Another approach which might solve this issue easily is Dynamic query
analyze chain: https://issues.apache.org/jira/browse/SOLR-5053

What would you do?


On Tue, Sep 17, 2013 at 10:31 PM, Isaac Hebsh isaac.he...@gmail.com wrote:

 Hi everyone,

 We developed a TokenFilter.
 It should act differently depending on a parameter supplied in the
 query (for the query chain only, not the index one, of course).
 We found no way to pass that parameter into the TokenFilter flow. I guess
 that the root cause is that TokenFilter is a pure Lucene object.

 As a last resort, we tried to pass the parameter as the first term in the
 query text (q=...), and save it as a member of the TokenFilter instance.

 Although it is ugly, it might work fine.
 But, the problem is that it is not guaranteed that all the terms of a
 particular query will be analyzed by the same instance of a TokenFilter. In
 this case, some terms will be analyzed without the required information of
 that parameter. We can produce such a race very easily.

 How should I overcome this issue?
 Does anyone have a better resolution?



Re: ReplicationFactor for solrcloud

2013-09-21 Thread Aditya Sakhuja
Thanks Shalin. We used the maxShardsPerNode=3 as you suggest here.
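
For anyone finding this later: with 3 servers, numShards=3, and
replicationFactor=3, each node has to host one core of every shard, hence
maxShardsPerNode=3. A Collections API call along these lines should produce
that layout (host and collection name assumed):

http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=3&maxShardsPerNode=3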


On Thu, Sep 12, 2013 at 4:09 AM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 You must specify maxShardsPerNode=3 for this to happen. By default
 maxShardsPerNode defaults to 1 so only one shard is created per node.

 On Thu, Sep 12, 2013 at 3:19 AM, Aditya Sakhuja
 aditya.sakh...@gmail.com wrote:
  Hi -
 
  I am trying to set the 3 shards and 3 replicas for my solrcloud
 deployment
  with 3 servers, specifying the replicationFactor=3 and numShards=3 when
  starting the first node. I see each of the servers allocated to 1 shard
  each. However, I do not see 3 replicas allocated on each node.
 
  I specifically need to have 3 replicas across 3 servers with 3 shards. Can
  you think of any reason not to have this configuration?
 
  --
  Regards,
  -Aditya Sakhuja



 --
 Regards,
 Shalin Shekhar Mangar.




-- 
Regards,
-Aditya Sakhuja


isolating solrcloud instance from peer updates

2013-09-21 Thread Aditya Sakhuja
Hello all,

Is there a way to isolate an active SolrCloud instance from all incoming
replication update requests from peer nodes?

-- 
Regards,
-Aditya Sakhuja


Re: Need help understanding the use cases behind core auto-discovery

2013-09-21 Thread Trey Grainger
While on this topic...

Is it still true in Solr 4.5 (RC) that it is not possible to have a shared
config directory?  In general, I like the new core.properties mechanism
better as it removes the unnecessary centralized configuration of cores in
solr.xml, but I have an infrastructure where I have thousands of Solr Cores
with the same configs on a single server, and as best I could tell with
Solr 4.4 the only way to support this with core.properties was to copy and
paste or create symbolic links for the whole conf/ folder for every core
(i.e. thousands of identical copies of all config files in my case).

In the old solr.xml format, we could set the instanceDir to have all cores
reference the same folder, but in core.properties there doesn't seem to be
anything like this.  I tried just referencing solrconfig.xml in another
directory, but because everything is now relative to the conf/ directory
under the folder containing core.properties, none of the referenced files
were in the right place.

Is there any better guidance on migrating to core auto-discovery with the
need for a shared config directory (non-SolrCloud mode)? This looked
promising, but it sounds dead based on Erick's JIRA comment:
https://issues.apache.org/jira/browse/SOLR-4478

Thanks,

-Trey


On Sat, Sep 21, 2013 at 2:25 PM, Erick Erickson erickerick...@gmail.com wrote:

 Also consider where SolrCloud is going. Trying to correctly maintain
 all the solr.xml files yourself on all the nodes would have
 been...interesting. On all the machines in your 200 node cluster.
 With 17 different collections. With nodes coming and going. With
 splitting shards. With...

 Collections are almost guaranteed to be distributed unevenly (e.g. a
 big collection might have 20 shards and a small collection 3 in the
 same cluster). So each node used to require a solr.xml that was unique as
 far as everything in the <cores> tag. But everything _not_ in the
 <cores> tag is common. Say you wanted to change the
 shardHandlerFactory (or any other setting we put in solr.xml that
 wouldn't have gone into the old <cores> tag). In the old-style way of
 doing things, since each solr.xml file on each node has potentially a
 different set of cores, you'd have to edit each and every one of them.

 The older way of doing this is fine as long as each solr.xml on each
 machine is self-consistent. So auto-discovery essentially automates
 that self-consistency.

 It also makes it possible to have Zookeeper manage your solr.xml and
 auto-distribute it to new nodes (or update existing) which would have
 taken a lot of effort to get right without auto-discovery. So changing
 the shardHandlerFactory consists of changing the solr.xml file and
 pushing it to ZooKeeper (don't quite remember the right JIRA, but you
 can do this now).

 I suppose it's like all other refactorings. Solr.xml had its origin
 in the single-core days; then, when multi-cores came into being, it was
 expanded to include that information, but eventually became, as Yonik
 says, unnecessary central configuration, which started becoming a
 limitation.

 FWIW,
 Erick

 On Fri, Sep 20, 2013 at 9:45 AM, Timothy Potter thelabd...@gmail.com
 wrote:
  Exactly the insight I was looking for! Thanks Yonik ;-)
 
 
  On Fri, Sep 20, 2013 at 10:37 AM, Yonik Seeley yo...@lucidworks.com
 wrote:
 
  On Fri, Sep 20, 2013 at 11:56 AM, Timothy Potter thelabd...@gmail.com
  wrote:
   Trying to add some information about core.properties and
 auto-discovery
  in
   Solr in Action and am at a loss for what to tell the reader is the
  purpose
   of this feature.
 
  IMO, it was more a removal of unnecessary central configuration.
  You previously had to list the core in solr.xml, and now you don't.
  Cores should be fully self-describing so that it should be easy to
  move them in the future just by moving the core directory (although
  that may not yet work...)
 
  -Yonik
  http://lucidworks.com
 
   Can anyone point me to any background information about core
   auto-discovery? I'm not interested in the technical implementation
  details.
   Mainly I'm trying to understand the motivation behind having this
 feature
   as it seems unnecessary with the Core Admin API. Best I can tell is it
   removes a manual step of firing off a call to the Core Admin API or
  loading
   a core from the Admin UI. If that's it and I'm overthinking it, then
 cool
   but was expecting more of an ah-ha moment with this feature ;-)
  
   Any insights you can share are appreciated.
  
   Thanks.
   Tim