Re: Czech stemmer

2014-09-10 Thread Lukáš Vlček
Hi,

I would recommend looking at a stemmer or token filter based on Hunspell
dictionaries. I am not a Solr user, so I can not point you to the appropriate
documentation about this, but the Czech dictionary that can be used with
Hunspell is of high quality. It can be downloaded from OpenOffice here
http://extensions.services.openoffice.org/en/project/czech-dictionary-pack-ceske-slovniky-cs-cz
(distributed under GPL).

Note: when I was looking at it the last time, I noticed that the dictionary
contained one broken affix rule, which may require a manual fix depending on
how strict the rule loader in Solr is. If you are interested in more details
and can not figure it out yourself, feel free to ping me again; I can point
you to some resources about how I used it in connection with Elasticsearch.
I assume the basic concepts apply to Solr as well.
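
From the Elasticsearch side, the equivalent Solr schema snippet would look
roughly like this (untested by me; the dictionary/affix file names and the
strictAffixParsing flag are assumptions based on the OpenOffice download and
the stock HunspellStemFilterFactory, so double-check them against your Solr
version):

<fieldType name="text_cz_hunspell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- cs_CZ.dic / cs_CZ.aff are the files from the OpenOffice dictionary
         pack; a lenient affix parser would let the loader skip past the
         broken affix rule mentioned above -->
    <filter class="solr.HunspellStemFilterFactory"
            dictionary="cs_CZ.dic" affix="cs_CZ.aff"
            ignoreCase="true" strictAffixParsing="false"/>
  </analyzer>
</fieldType>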

Regards,
Lukas

2014-09-09 22:14 GMT+02:00 Shamik Bandopadhyay sham...@gmail.com:

 Hi,

   I'm facing stemming issues with Czech language search. Solr/Lucene
 currently provides CzechStemFilterFactory as the sole option; Snowball
 Porter doesn't seem to be available for Czech. Here's the issue.

 I'm trying to search for 'posunout' (meaning 'move' in English), which
 returns results, but the search fails if I use 'posunulo' (meaning 'moved'
 in English). I used the following text as the field content for the search.

 Pomocí multifunkčních uzlů je možné odkazy mnoha způsoby upravovat. Můžete
 přidat a odstranit odkazy, přidat a odstranit vrcholy, prodloužit nebo
 přesunout prodloužení čáry nebo přesunout text odkazu. Přístup k požadované
 možnosti získáte po přesunutí ukazatele myši na uzel. Z uzlu prodloužení
 čáry můžete zvolit tyto možnosti: Protáhnout: Umožňuje posunout prodloužení
 odkazové čáry. Délka prodloužení čáry: Umožňuje prodloužit prodloužení
 čáry. Přidat odkaz: Umožňuje přidat jednu nebo více odkazových čar. Z uzlu
 koncového bodu odkazu můžete zvolit tyto možnosti: Protáhnout: Umožňuje
 posunout koncový bod odkazové čáry. Přidat vrchol: Umožňuje přidat vrchol k
 odkazové čáře. Odstranit odkaz: Umožňuje odstranit vybranou odkazovou čáru.
 Z uzlu vrcholu odkazu můžete zvolit tyto možnosti: Protáhnout: Umožňuje
 posunout vrchol. Přidat vrchol: Umožňuje přidat vrchol na odkazovou čáru.
 Odstranit vrchol: Umožňuje odstranit vrchol. 

 Just wondering if there's a different stemmer available or a way to address
 this.

 Schema :

 <fieldType name="text_csy" class="solr.TextField"
     positionIncrementGap="100" autoGeneratePhraseQueries="true">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true"
         words="lang/stopwords_cz.txt"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms_csy.txt"
         ignoreCase="true" expand="true"/>
     <filter class="solr.CzechStemFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true"
         words="lang/stopwords_cz.txt"/>
     <filter class="solr.CzechStemFilterFactory"/>
   </analyzer>
 </fieldType>

 Any pointers will be appreciated.

 - Thanks,
 Shamik



FileNotFoundException, Error closing IndexWriter, Error opening new searcher

2014-09-10 Thread Oliver Schrenk
Hi,

in the last few days we had some trouble with one of our clusters (5 machines,
each running 4.7.2 inside a Jetty container, no replication, Java 1.7.21). Two
times we had trouble restarting one server (the same machine) because of a
FileNotFoundException.


1. First time: Stopping Solr while indexing resulted in the following log 
output:

 2014-09-04 10:09:45,633 INFO o.a.s.s.SolrIndexSearcher 
[recoveryExecutor-6-thread-1] Opening Searcher@2b94db[shard2_replica1] realtime
 2014-09-04 10:09:45,634 INFO o.a.s.u.DirectUpdateHandler2 
[recoveryExecutor-6-thread-1] Reordered DBQs detected.  Update=add{...} 
DBQs=[...]
 2014-09-04 10:09:45,646 ERROR o.a.s.c.SolrException 
[recoveryExecutor-6-thread-1] Error opening realtime searcher for 
deleteByQuery:org.apache.solr.common.SolrException: Error opening new searcher
 at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1521)
 at org.apache.solr.update.UpdateLog.add(UpdateLog.java:422)
 at 
org.apache.solr.update.DirectUpdateHandler2.addAndDelete(DirectUpdateHandler2.java:449)
 at 
org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:216)
 at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
 at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
 at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
 at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:704)
 at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:858)
 at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:557)
 at 
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
 at 
org.apache.solr.update.UpdateLog$LogReplayer.doReplay(UpdateLog.java:1326)
 at 
org.apache.solr.update.UpdateLog$LogReplayer.run(UpdateLog.java:1215)
 at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.concurrent.FutureTask.run(FutureTask.java:166)
 at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.concurrent.FutureTask.run(FutureTask.java:166)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:722)
 Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: 
_7omin_Lucene41_0.tip
 at 
org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:252)
 at 
org.apache.lucene.index.TieredMergePolicy$SegmentByteSizeDescending.compare(TieredMergePolicy.java:238)
 at java.util.TimSort.binarySort(TimSort.java:265)
 at java.util.TimSort.sort(TimSort.java:208)
 at java.util.TimSort.sort(TimSort.java:173)
 at java.util.Arrays.sort(Arrays.java:659)
 at java.util.Collections.sort(Collections.java:217)
 at 
org.apache.lucene.index.TieredMergePolicy.findMerges(TieredMergePolicy.java:286)
 at 
org.apache.lucene.index.IndexWriter.updatePendingMerges(IndexWriter.java:1970)
 at 
org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1940)
 at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:404)
 at 
org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:289)
 at 
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:274)
 at 
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:250)
 at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1445)
 ... 21 more

2. Second time: we brought some updates to the init.d scripts and had to
restart each server in the cluster. No indexing was going on at this time.
The same server crashed, now with this output:

While shutting down

 2014-09-05 15:13:39,204 INFO o.a.s.c.c.ZkStateReader$2 [main-EventThread] A 
cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged 
path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
 2014-09-05 15:13:39,585 INFO o.a.s.c.c.ZkStateReader$2 [main-EventThread] A 
cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged 
path:/clusterstate.json, has occurred - updating... (live nodes size: 5)
 2014-09-05 15:13:39,586 INFO o.a.s.c.c.ZkStateReader$2 [main-EventThread] A 
cluster state change: WatchedEvent state:SyncConnected 

Solr Spellcheck suggestions only return from /select handler when returning search results

2014-09-10 Thread Thomas Michael Engelke
 Hi,

I'm experimenting with the Spellcheck component and have therefore
used the example configuration for spell checking to try things out. My
solrconfig.xml looks like this:

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
   <str name="queryAnalyzerFieldType">spell</str>
   <!-- Multiple Spell Checkers can be declared and used by this component -->
   <!-- a spellchecker built from a field of the main index -->
   <lst name="spellchecker">
     <str name="name">default</str>
     <str name="field">spell</str>
     <str name="classname">solr.DirectSolrSpellChecker</str>
     <!-- the spellcheck distance measure used, the default is the internal levenshtein -->
     <str name="distanceMeasure">internal</str>
     <!-- uncomment this to require suggestions to occur in 1% of the documents
     <float name="thresholdTokenFrequency">.01</float>
     -->
   </lst>
   <!-- a spellchecker that can break or combine words. See "/spell" handler below for usage -->
   <lst name="spellchecker">
     <str name="name">wordbreak</str>
     <str name="classname">solr.WordBreakSolrSpellChecker</str>
     <str name="field">spell</str>
     <str name="combineWords">true</str>
     <str name="breakWords">true</str>
     <int name="maxChanges">10</int>
   </lst>
 </searchComponent>

And I've added the spellcheck component to my
/select request handler:

 <requestHandler name="/select" class="solr.SearchHandler">
   ...
   <arr name="last-components">
     <str>spellcheck</str>
   </arr>
 </requestHandler>

I have built up the
spellchecker source in the schema.xml from the name field:

 <field name="spell" type="spell" indexed="true" stored="true"
     required="false" multiValued="false"/>
 <copyField source="name" dest="spell" maxChars="3"/>
 ...
 <fieldType name="spell" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.StandardTokenizerFactory"/>
   </analyzer>
 </fieldType>

As I'm querying the /select request handler, I should get spellcheck
suggestions with my results. However, I rarely get a suggestion. Examples:

query: Sichtscheibe, spellcheck suggestion: Sichtscheiben (works)
query: Sichtscheib, spellcheck suggestion: Sichtscheiben (works)
query: ichtscheiben, no spellcheck suggestions

As far as I can identify, I only get suggestions when I get real search
results. I get results for the first two examples because the German
StemFilterFactory stems Sichtscheibe and Sichtscheiben to Sichtscheib, so
matches are found. However, the third query should result in a suggestion,
as the Levenshtein distance is smaller than in the second example.

Suggestions, improvements, corrections?

 

Re: Integrate solr with openNLP

2014-09-10 Thread Aman Tandon
Hi,

What is the progress of the integration of NLP with Solr? If you have
achieved this integration successfully, please share it with us.

With Regards
Aman Tandon

On Tue, Jun 10, 2014 at 11:04 AM, Vivekanand Ittigi vi...@biginfolabs.com
wrote:

 Hi Aman,

 Yeah, we are also thinking the same. Using UIMA is better. And thanks to
 everyone. You guys really showed us the way (UIMA).

 We'll work on it.

 Thanks,
 Vivek


 On Fri, Jun 6, 2014 at 5:54 PM, Aman Tandon amantandon...@gmail.com
 wrote:

  Hi Vivek,
 
  As everybody on the mailing list mentioned, you should go for UIMA,
  as OpenNLP issues are not tracked properly; that could get your
  development stuck in the near future if any issue comes up, so it's
  better to start investigating UIMA.
 
 
  With Regards
  Aman Tandon
 
 
  On Fri, Jun 6, 2014 at 11:00 AM, Vivekanand Ittigi 
 vi...@biginfolabs.com
  wrote:
 
    Can anyone please reply?
  
   Thanks,
   Vivek
  
   -- Forwarded message --
   From: Vivekanand Ittigi vi...@biginfolabs.com
   Date: Wed, Jun 4, 2014 at 4:38 PM
   Subject: Re: Integrate solr with openNLP
   To: Tommaso Teofili tommaso.teof...@gmail.com
   Cc: solr-user@lucene.apache.org solr-user@lucene.apache.org, Ahmet
   Arslan iori...@yahoo.com
  
  
   Hi Tommaso,
  
   Yes, you are right, the 4.4 version will work. I'm able to compile now.
   I'm trying to apply the named-entity recognition (person name) token
   filter, but I'm not seeing any change. My schema.xml looks like this:
  
   <field name="text" type="text_opennlp_pos_ner" indexed="true"
       stored="true" multiValued="true"/>

   <fieldType name="text_opennlp_pos_ner" class="solr.TextField"
       positionIncrementGap="100">
     <analyzer>
       <tokenizer class="solr.OpenNLPTokenizerFactory"
           tokenizerModel="opennlp/en-token.bin"/>
       <filter class="solr.OpenNLPFilterFactory"
           nerTaggerModels="opennlp/en-ner-person.bin"/>
       <filter class="solr.LowerCaseFilterFactory"/>
     </analyzer>
   </fieldType>
  
   Please guide?
  
   Thanks,
   Vivek
  
  
   On Wed, Jun 4, 2014 at 1:27 PM, Tommaso Teofili 
  tommaso.teof...@gmail.com
   
   wrote:
  
 Hi all,

 Ahmet was suggesting to eventually use the UIMA integration because
 OpenNLP already has an integration with Apache UIMA, so you would just
 have to use that [1].
 And that's one of the main reasons the UIMA integration was done: it's a
 framework that you can easily hook into in order to plug in your NLP
 algorithm.

 If you want to just use OpenNLP, then it's up to you to either write
 your own UpdateRequestProcessor plugin [2] to add metadata extracted by
 OpenNLP to your documents, or to write a dedicated analyzer / tokenizer
 / token filter.

 For the OpenNLP integration (LUCENE-2899), the patch is not up to date
 with the latest APIs in trunk; however, you should be able to apply it
 (if I recall correctly) to the 4.4 version or so, and adapting it to the
 latest API shouldn't be too hard.

 Regards,
 Tommaso
   
 [1] : http://opennlp.apache.org/documentation/1.5.3/manual/opennlp.html#org.apche.opennlp.uima
 [2] : http://wiki.apache.org/solr/UpdateRequestProcessor
   
   
   
2014-06-03 15:34 GMT+02:00 Ahmet Arslan iori...@yahoo.com.invalid:
   
 Can you extract names, locations, etc. using OpenNLP in a plain/straight
 Java program?

 If yes, here are two separate options:

 1) Use http://searchhub.org/2012/02/14/indexing-with-solrj/ as an
 example, integrate your NER code into it, and write your own indexing
 code. You have the full power here. No Solr plugins are involved.

 2) Use 'Implementing a conditional copyField' given here:
 http://wiki.apache.org/solr/UpdateRequestProcessor
 as an example and integrate your NER code into it.

 Please note that these are separate ways to enrich your incoming
 documents; choose either (1) or (2).
   
   
   
On Tuesday, June 3, 2014 3:30 PM, Vivekanand Ittigi 
vi...@biginfolabs.com wrote:
 Okay, but I didn't understand what you said. Can you please elaborate.
   
Thanks,
Vivek
   
   
   
   
   
On Tue, Jun 3, 2014 at 5:36 PM, Ahmet Arslan iori...@yahoo.com
  wrote:
   
  Hi Vivekanand,
 
  I have never used UIMA+Solr before.
 
  Personally I think it takes more time to learn how to configure/use
  this UIMA stuff.
 
  If you are familiar with Java, write a class that extends
  UpdateRequestProcessor(Factory). Use OpenNLP for NER and add these new
  fields (organisation, city, person name, etc.) to your documents. This
  phase is usually called 'enrichment'.
 
  Does that make sense?
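
 A bare-bones sketch of such a processor (the field names, the model path,
 and the chain wiring are illustrative assumptions, not a drop-in
 implementation; you would still need the corresponding Factory class):

 import java.io.FileInputStream;
 import java.io.IOException;
 import opennlp.tools.namefind.NameFinderME;
 import opennlp.tools.namefind.TokenNameFinderModel;
 import opennlp.tools.tokenize.SimpleTokenizer;
 import opennlp.tools.util.Span;
 import org.apache.solr.common.SolrInputDocument;
 import org.apache.solr.update.AddUpdateCommand;
 import org.apache.solr.update.processor.UpdateRequestProcessor;

 public class NerEnrichmentProcessor extends UpdateRequestProcessor {
   private final NameFinderME personFinder;

   public NerEnrichmentProcessor(UpdateRequestProcessor next) throws IOException {
     super(next);
     // hypothetical model location; in real code, load the model once and share it
     personFinder = new NameFinderME(
         new TokenNameFinderModel(new FileInputStream("opennlp/en-ner-person.bin")));
   }

   @Override
   public void processAdd(AddUpdateCommand cmd) throws IOException {
     SolrInputDocument doc = cmd.getSolrInputDocument();
     Object text = doc.getFieldValue("text");
     if (text != null) {
       String[] tokens = SimpleTokenizer.INSTANCE.tokenize(text.toString());
       for (Span span : personFinder.find(tokens)) {
         // copy each recognized person name into a new multivalued field
         StringBuilder name = new StringBuilder();
         for (int i = span.getStart(); i < span.getEnd(); i++) {
           if (name.length() > 0) name.append(' ');
           name.append(tokens[i]);
         }
         doc.addField("person_ss", name.toString());
       }
     }
     super.processAdd(cmd); // hand the enriched document down the chain
   }
 }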



 On Tuesday, June 3, 2014 2:57 PM, Vivekanand Ittigi 
vi...@biginfolabs.com
 wrote:
 Hi Ahmet,

 I followed what you said
 

Genre classification/Document classification for apache solr

2014-09-10 Thread vineet yadav
Hi,
I want to crawl links and identify whether a link is a company website. For
example, if I use the phrase 'financial advisory' in the Google search
engine, I will get a list of URLs in the search results. Some of the links
are company websites. I want to identify those links which are company
websites and index them into Solr. Does anybody know of an API/tool which
can identify whether a link is a company website or not, or an API/tool
which can identify a URL's genre/type on the basis of a taxonomy?

Thanks
Vineet Yadav


Re: Edismax mm and efficiency

2014-09-10 Thread Peter Keegan
I implemented a custom QueryComponent that issues the edismax query with
mm=100%, and if no results are found, it reissues the query with mm=1. This
doubled our query throughput (compared to mm=1 always), as we do some
expensive RankQuery processing. For your very long student queries, mm=100%
would obviously be too high, so you'd have to experiment.

On Fri, Sep 5, 2014 at 1:34 PM, Walter Underwood wun...@wunderwood.org
wrote:

 Great!

 We have some very long queries, where students paste entire homework
 problems. One of them was 1051 words. Many of them are over 100 words. This
 could help.

 In the Jira discussion, I saw some comments about handling the most sparse
 lists first. We did something like that in the Infoseek Ultra engine about
 twenty years ago. Short termlists (documents matching a term) were
 processed first, which kept the in-memory lists of matching docs small. It
 also allowed early short-circuiting for no-hits queries.

 What would be a high mm value, 75%?

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/


 On Sep 4, 2014, at 11:52 PM, Mikhail Khludnev mkhlud...@griddynamics.com
 wrote:

  indeed https://issues.apache.org/jira/browse/LUCENE-4571
  my feeling is that it gives a significant gain at high mm values.
 
 
 
  On Fri, Sep 5, 2014 at 3:01 AM, Walter Underwood wun...@wunderwood.org
  wrote:
 
  Are there any speed advantages to using “mm”? I can imagine pruning the
  set of matching documents early, which could help, but is that (or
  something else) done?
 
  wunder
  Walter Underwood
  wun...@wunderwood.org
  http://observer.wunderwood.org/
 
 
 
 
 
  --
  Sincerely yours
  Mikhail Khludnev
  Principal Engineer,
  Grid Dynamics
 
  http://www.griddynamics.com
  mkhlud...@griddynamics.com




Re: Integrate solr with openNLP

2014-09-10 Thread Vivekanand Ittigi
Actually we dropped integrating NLP with Solr directly, but we took two
different approaches:

* we're using NLP separately, not with Solr
* we're taking the help of UIMA for Solr; it's more advanced.

If you have a specific question, you can ask me. I'll answer if I know.

-Vivek


Installing solr on tomcat 7.x | Window 8

2014-09-10 Thread Umesh Awasthi
I am trying to follow the official documentation as well as other resources
available on the net, but I am unable to run Solr on my Tomcat.

I am trying to install and run `solr-4.10.0` on Tomcat. This is what I have
done so far:

 1. Copied solr-4.10.0.war to the Tomcat web-app folder and renamed it to
solr.war.
 2. Created a folder on my `D` drive with the name `solr-home`.
 3. Copied everything from `solr-4.10.0\example\solr` and pasted it in the
`solr-home` folder.
 4. Through the Environment Variables dialog, under user variables, I set
the following: `solr.solr.home=D:\solr-home`

Started the Tomcat server; it starts without any error / exception, but when
I try to hit the URL `http://localhost:8080/solr/`, I get the following error:

`message {msg=SolrCore 'collection1' is not available due to init failure:
Could not load conf for core collection1: Error loading solr config from
solr/collection1\solrconfig.xml,trace=org.apache.solr.common.SolrException:
SolrCore 'collection1' is not available due to init failure: Could not load
conf for core collection1: Error loading solr config from
solr/collection1\solrconfig.xml at
org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:745) at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:307)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1041)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:603)
at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at
java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at
java.lang.Thread.run(Unknown Source) Caused by:
org.apache.solr.common.SolrException: Could not load conf for core
collection1: Error loading solr config from solr/collection1\solrconfig.xml
at
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:66)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:489) at
org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:255) at
org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249) at
java.util.concurrent.FutureTask.run(Unknown Source) ... 3 more Caused by:
org.apache.solr.common.SolrException: Error loading solr config from
solr/collection1\solrconfig.xml at
org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:148)
at
org.apache.solr.core.ConfigSetService.createSolrConfig(ConfigSetService.java:80)
at
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:61)
... 7 more Caused by: java.io.IOException: Can't find resource
'solrconfig.xml' in classpath or 'C:\Program Files\Apache Software
Foundation\Tomcat 7.0\solr\collection1\conf' at
org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:362)
at
org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:308)
at org.apache.solr.core.Config.<init>(Config.java:116) at
org.apache.solr.core.Config.<init>(Config.java:86) at
org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:161) at
org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:144)
... 9 more ,code=500}`

I am not sure what I am doing wrong, or whether it is really so tricky to
install Solr.

-- 
With Regards
Umesh Awasthi
http://www.travellingrants.com/


Problem while extending TokenizerFactory in Solr 4.4.0

2014-09-10 Thread Francesco Valentini
Hi All,



I’m using the Solr 4.4.0 distro, and now I have a strange issue while
extending TokenizerFactory with a custom class.

This is an excerpt of the pom I use:





<properties>
    <solr.version>4.4.0</solr.version>
</properties>

<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>${solr.version}</version>
</dependency>

<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-analyzers-common</artifactId>
    <version>${solr.version}</version>
</dependency>

<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-queryparser</artifactId>
    <version>${solr.version}</version>
</dependency>

<dependency>
    <groupId>org.apache.solr</groupId>
    <artifactId>solr-core</artifactId>
    <version>${solr.version}</version>
</dependency>



I always get the exception below during Solr engine initialization:

org.apache.solr.common.SolrException: Plugin init failure for [schema.xml]
fieldType "rel": Plugin init failure for [schema.xml] analyzer/tokenizer:
Error instantiating class: 'com.mytest.tokenizer.RelationChunkTokenizerFactory'

at
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)

at
org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:467)

at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:164)

at
org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)

at
org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)

at
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:619)

at org.apache.solr.core.CoreContainer.create(CoreContainer.java:657)

at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364)

at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)

at java.util.concurrent.FutureTask.run(Unknown Source)

at java.util.concurrent.Executors$RunnableAdapter.call(Unknown
Source)

at java.util.concurrent.FutureTask.run(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
Source)

at java.lang.Thread.run(Unknown Source)

Caused by: org.apache.solr.common.SolrException: Plugin init failure for
[schema.xml] analyzer/tokenizer: Error instantiating class:
'com.mytest.tokenizer.RelationChunkTokenizerFactory'

at
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)

at
org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:362)

at
org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)

at
org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)

at
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)

... 14 more

Caused by: org.apache.solr.common.SolrException: Error instantiating class:
'com.mytest.tokenizer.RelationChunkTokenizerFactory'

at
org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:556)

at
org.apache.solr.schema.FieldTypePluginLoader$2.create(FieldTypePluginLoader.java:342)

at
org.apache.solr.schema.FieldTypePluginLoader$2.create(FieldTypePluginLoader.java:335)

at
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)

... 18 more

Caused by: java.lang.NoSuchMethodException:
com.mytest.tokenizer.RelationChunkTokenizerFactory.<init>(java.util.Map)

at java.lang.Class.getConstructor0(Unknown Source)

at java.lang.Class.getConstructor(Unknown Source)

at
org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:552)

... 21 more

8604 [coreLoadExecutor-3-thread-1] ERROR
org.apache.solr.core.CoreContainer -
null:org.apache.solr.common.SolrException: Unable to create core:
collection1

at
org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1150)

at org.apache.solr.core.CoreContainer.create(CoreContainer.java:666)

at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364)

at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)

at java.util.concurrent.FutureTask.run(Unknown Source)

at java.util.concurrent.Executors$RunnableAdapter.call(Unknown
Source)

at java.util.concurrent.FutureTask.run(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
Source)

at java.lang.Thread.run(Unknown Source)

Caused by: org.apache.solr.common.SolrException: Plugin init failure for
[schema.xml] fieldType "rel": Plugin init failure for [schema.xml]
analyzer/tokenizer: Error instantiating class:
'com.altilia.platform.tokenizer.RelationChunkTokenizerFactory'

at
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)

at

RE: Solr Spellcheck suggestions only return from /select handler when returning search results

2014-09-10 Thread Dyer, James
Thomas,

It looks like you've set things up correctly in that while the user is 
searching against a stemmed field (name), spellcheck is checking against a 
lightly-analyzed copy of it (spell).  This is the right way to do it as 
spellcheck against stemmed forms is usually undesirable.

But as you've experienced, you will sometimes get results (due to stemming) and 
also suggestions (because the spellchecker is looking at unstemmed forms).  If 
you do not want spellcheck to return anything when you get results, you can set 
spellcheck.maxResultsForSuggest=0.

Now keeping in mind we're comparing unstemmed forms, can you verify you indeed 
have something in your index that is within 2 edits of "ichtscheiben"?  My 
guess is you probably don't, which would be why you do not get spelling results 
in that case.

Also, even if you do have something within 2 edits, if ichtscheiben occurs in 
your index, by default it won't try to correct it at all (even if the query 
returns nothing, maybe because of filters or other required terms on the 
query).  In this case you need to set spellcheck.alternativeTermCount to a 
non-zero value (try maybe 5).

See 
http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.alternativeTermCount 
and following sections.
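
For example, a request exercising both parameters might look like this (the
handler and field come from your config; the values are just starting points):

/select?q=name:ichtscheiben&spellcheck=true
    &spellcheck.maxResultsForSuggest=0
    &spellcheck.alternativeTermCount=5
    &spellcheck.count=10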

James Dyer
Ingram Content Group
(615) 213-4311




Re: Problem while extending TokenizerFactory in Solr 4.4.0

2014-09-10 Thread Shawn Heisey
On 9/10/2014 7:14 AM, Francesco Valentini wrote:
 I’m using the Solr 4.4.0 distro, and now I have a strange issue while
 extending TokenizerFactory with a custom class.

I think what we have here is a basic Java error, nothing specific to
Solr.  This jumps out at me:

Caused by: java.lang.NoSuchMethodException:
com.mytest.tokenizer.RelationChunkTokenizerFactory.<init>(java.util.Map)
at java.lang.Class.getConstructor0(Unknown Source)
at java.lang.Class.getConstructor(Unknown Source)
at
org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:552)
... 21 more

Java is trying to execute a method that doesn't exist.  The
getConstructor pieces after the message suggest that perhaps it's a
constructor with a Map as an argument, but I'm not familiar enough with
this error to know whether it's trying to run a constructor that doesn't
exist, or whether it's trying to actually use a method called <init>.

The constructor in TokenizerFactory is protected, and all of the
existing descendants that I looked at have a public constructor ... this
message would make sense in all of the following situations:

1) You didn't create a constructor for your object with a Map argument.
2) You made your constructor protected.
3) You made your constructor private.
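
In other words, the loader needs roughly this shape (a sketch against the
4.x API; RelationChunkTokenizer stands in for the actual tokenizer class):

import java.io.Reader;
import java.util.Map;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.util.TokenizerFactory;
import org.apache.lucene.util.AttributeSource.AttributeFactory;

public class RelationChunkTokenizerFactory extends TokenizerFactory {
  // a public constructor taking Map<String,String> is what Solr looks up
  public RelationChunkTokenizerFactory(Map<String,String> args) {
    super(args);
  }

  @Override
  public Tokenizer create(AttributeFactory factory, Reader input) {
    return new RelationChunkTokenizer(factory, input);
  }
}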

Thanks,
Shawn



Re: Installing solr on tomcat 7.x | Window 8

2014-09-10 Thread Shawn Heisey
On 9/10/2014 6:45 AM, Umesh Awasthi wrote:
 `message {msg=SolrCore 'collection1' is not available due to init failure:
 Could not load conf for core collection1: Error loading solr config from
 solr/collection1\solrconfig.xml,trace=org.apache.solr.common.SolrException:
 SolrCore 'collection1' is not available due to init failure: Could not load
 conf for core collection1: Error loading solr config from
 solr/collection1\solrconfig.xml at

The path in the error message is wrong.  See SOLR-5814.

https://issues.apache.org/jira/browse/SOLR-5814

Is there anything in the log before this message?  You will need to find
the actual solr logfile ... because you used tomcat and not the jetty
included with Solr, I cannot tell you where this logfile is, although if
you copied the logging jars and the logging config from the example, it
will be logs\solr.log, relative to the current working directory of the
process that started tomcat.

A side note:  Windows 8 is a client operating system.  Microsoft has
crippled their client operating systems in some way compared to their
server operating systems.  Heavy multi-threaded server workloads like
Solr will not work as well.  I don't know exactly what the differences are.

If you really want to run Solr on a Windows system, you really should
put it on Server 2012, not Windows 8.  Unfortunately, their server
operating systems have a rather high price tag.  It will be the opinion
of most people here that you and your pocketbook would be far happier
with the results of running on Linux -- better performance and no cost.

Thanks,
Shawn



Re: Problem while extending TokenizerFactory in Solr 4.4.0

2014-09-10 Thread Francesco Valentini
Hi Shawn,
thank you very much for your quick anwser,
I fixed it.

Thanks
Francesco




Modify Schema - Schema API

2014-09-10 Thread Joseph Obernberger
In addition to adding new fields to the schema, is there a way to modify an
existing field?  If I created a field called userID as a long, but decided
later that it should be a string?
Thank you!

-Joe


Re: Edismax mm and efficiency

2014-09-10 Thread Walter Underwood
We do that strict/loose query sequence, but on the client side with two 
requests. Would you consider contributing the QueryComponent?
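
Roughly this, sketched with SolrJ (the server URL and userQueryString are
illustrative; error handling omitted):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

HttpSolrServer solrServer = new HttpSolrServer("http://localhost:8983/solr/collection1");
SolrQuery q = new SolrQuery(userQueryString);
q.set("defType", "edismax");
q.set("mm", "100%");                       // strict pass first
QueryResponse rsp = solrServer.query(q);
if (rsp.getResults().getNumFound() == 0) { // nothing matched all terms
    q.set("mm", "1");                      // loose fallback pass
    rsp = solrServer.query(q);
}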

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/





Re: Modify Schema - Schema API

2014-09-10 Thread Anshum Gupta
Hi Joseph,

It isn't supported by an existing REST API (if that was your question), but
you can always edit the schema manually (if it isn't managed), upload the
new schema, and reload the collections (or cores, in the case of
non-SolrCloud mode).

Do remember that changing the field type might require you to reindex your
data.

There's an open JIRA for that one and I think someone would get to it
sometime in the reasonably near future.
JIRA: https://issues.apache.org/jira/browse/SOLR-5289
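
For example, reloading is just an HTTP call to the Collections API (the
collection name here is illustrative):

http://localhost:8983/solr/admin/collections?action=RELOAD&name=collection1

(or /solr/admin/cores?action=RELOAD&core=collection1 in non-SolrCloud mode).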

On Wed, Sep 10, 2014 at 8:05 AM, Joseph Obernberger 
joseph.obernber...@gmail.com wrote:

 In addition to adding new fields to the schema, is there a way to modify an
 existing field?  If I created a field called userID as a long, but decided
 later that it should be a string?
 Thank you!

 -Joe




-- 

Anshum Gupta
http://www.anshumgupta.net


Re: Modify Schema - Schema API

2014-09-10 Thread Joseph Obernberger
Thank you - yes that was my question.  I should have stated that it was for
SolrCloud and hence a managed schema.  Could I bring down the shards, edit
the managed schema on zookeeper, fire the shards back up and re-index?

-Joe



Problems for indexing large documents on SolrCloud

2014-09-10 Thread Olivier
Hi,

I have some problems indexing large documents in a SolrCloud cluster of
3 servers (Solr 4.8.1) with 3 shards and 2 replicas for each shard, on
Tomcat 7.
One specific document (with 300K values in a multivalued field) I couldn't
index on SolrCloud, but I could index it in a single instance of Solr on
my own PC.

The indexing is done with Solarium from a database. The data indexed are
e-commerce products with classic fields like name, price, description,
instock, etc. The large field (type int) consists of other product ids.
The only difference from the other documents indexed successfully on Solr
is the size of that multivalued field: the other documents all have between
100K and 200K values for that field.
The index size is 11 MB for 20 documents.

To solve it, I tried to change several parameters, including the ZK timeouts
in solr.xml:

In the <solrcloud> section:

<int name="zkClientTimeout">6</int>
<int name="distribUpdateConnTimeout">10</int>
<int name="distribUpdateSoTimeout">10</int>

In the <shardHandlerFactory> section:

<int name="socketTimeout">${socketTimeout:10}</int>
<int name="connTimeout">${connTimeout:10}</int>

I also tried to increase these values in solrconfig.xml:

<requestParsers enableRemoteStreaming="true"
                multipartUploadLimitInKB="1"
                formdataUploadLimitInKB="10"
                addHttpRequestToContext="false"/>




I also tried to increase the amount of RAM (these are VMs): each server has
4 GB of RAM, with 3 GB for the JVM.

Are there other settings that could solve the problem and that I might have
forgotten?


The error messages are:

ERROR SolrDispatchFilter  null:java.lang.RuntimeException: [was class
java.net.SocketException] Connection reset

ERROR SolrDispatchFilter  null:ClientAbortException:
java.net.SocketException: broken pipe

ERROR SolrDispatchFilter  null:ClientAbortException:
java.net.SocketException: broken pipe

ERROR SolrCore  org.apache.solr.common.SolrException: Unexpected end of
input block; expected an identifier

ERROR SolrCore  org.apache.solr.common.SolrException: Unexpected end of
input block; expected an identifier

ERROR SolrCore  org.apache.solr.common.SolrException: Unexpected end of
input block; expected an identifier

ERROR SolrCore  org.apache.solr.common.SolrException: Unexpected EOF in
attribute value

ERROR SolrCore  org.apache.solr.common.SolrException: Unexpected end of
input block in start tag

Thanks,

Olivier


Re: Wildcard in FL parameter not working with Solr 4.10.0

2014-09-10 Thread Mike Hugo
This may have been introduced by changes made to solve
https://issues.apache.org/jira/browse/SOLR-5968

I created https://issues.apache.org/jira/browse/SOLR-6501 to track the new
bug.

On Tue, Sep 9, 2014 at 4:53 PM, Mike Hugo m...@piragua.com wrote:

 Hello,

 With Solr 4.7 we had some queries that return dynamic fields by passing in
 a fl=*_exact parameter; this is not working for us after upgrading to Solr
 4.10.0.  This appears to only be a problem when requesting wildcarded
 fields via SolrJ

 With Solr 4.10.0 - I downloaded the binary and set up the example:

 cd example
 java -jar start.jar
 java -jar post.jar solr.xml monitor.xml

 In a browser, if I request

 http://localhost:8983/solr/collection1/select?q=*:*&wt=json&indent=true&fl=*d

 All is well with the world:

 {"responseHeader":{"status":0,"QTime":1,"params":{"fl":"*d","indent":"true",
 "q":"*:*","wt":"json"}},"response":{"numFound":2,"start":0,
 "docs":[{"id":"SOLR1000"},{"id":"3007WFP"}]}}

 However if I do the same query with SolrJ (groovy script)


 @Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.10.0')

 import org.apache.solr.client.solrj.SolrQuery
 import org.apache.solr.client.solrj.impl.HttpSolrServer

 HttpSolrServer solrServer = new HttpSolrServer(
     "http://localhost:8983/solr/collection1")
 SolrQuery q = new SolrQuery("*:*")
 q.setFields("*d")
 println solrServer.query(q)


 No fields are returned:


 {responseHeader={status=0,QTime=0,params={fl=*d,q=*:*,wt=javabin,version=2}},
 response={numFound=2,start=0,docs=[SolrDocument{}, SolrDocument{}]}}



 Any ideas as to why when using SolrJ wildcarded fl fields are not returned?

 Thanks,

 Mike



Re: Modify Schema - Schema API

2014-09-10 Thread Anshum Gupta
You don't need to bring down the shards/collections, instead here's what
you can do:
* Retain the filename (managed_schema, if you didn't change the default
resource name).
* Edit the file locally
* Upload it to replace the current zk file.
* Reload the collection(s).
* Reindex

Here's another thing you can do:
* Upload the updated configs to zk
* Create a new collection (different name) using the new configs
* Reindex data to the new collection.
* Use collection aliasing to swap the old/new collections.
(http://www.anshumgupta.net/2013/10/collection-aliasing-in-solrcloud.html)

All this while, you wouldn't really need to shut down the Solr
cluster/collection etc.
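
As a concrete sketch with the stock zkcli script (the ZK host, config name,
and file paths are examples):

cloud-scripts/zkcli.sh -zkhost localhost:2181 \
    -cmd putfile /configs/myconf/managed-schema managed-schema

and then reload:

curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection'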




-- 

Anshum Gupta
http://www.anshumgupta.net


RE: [Announce] Apache Solr 4.10 with RankingAlgorithm 1.5.4 available now with complex-lsa algorithm (simulates human language acquisition and recognition)

2014-09-10 Thread nnagarajayya
Hi Diego:

I am not sure about solr-sense, but complex-lsa is an enhanced LSA
implementation with TERM-DOCUMENT similarity, etc. (not found in plain LSA).
The relevance/ranking is again different and is more accurate, as it uses the
RankingAlgorithm scoring model. The query performance gain in this version is
significant compared to the last release: a TERM_SIMILARITY query that used to
take about 8-9 seconds now takes just 30 ms to 80 ms. Lots of performance
improvements ...

Warm Regards,

Nagendra Nagarajayya
 http://solr-ra.tgels.org
 http://elasticsearch-ra.tgels.org
 http://rankingalgorithm.tgels.org
(accurate and relevant, simulates human language acquisition and recognition)

-Original Message-
From: Diego Fernandez [mailto:difer...@redhat.com] 
Sent: Tuesday, September 9, 2014 10:38 AM
To: solr-user@lucene.apache.org
Cc: gene...@lucene.apache.org
Subject: Re: [Announce] Apache Solr 4.10 with RankingAlgorithm 1.5.4 available 
now with complex-lsa algorithm (simulates human language acquisition and 
recognition)

Interesting.  Does anyone know how that compares to this 
http://www.searchbox.com/products/searchbox-plugins/solr-sense/?

Diego Fernandez - 爱国
Software Engineer
US GSS Supportability - Diagnostics


- Original Message -
 Hi!
 
 I am very excited to announce the availability of Apache Solr 4.10 
 with RankingAlgorithm 1.5.4.
 
 Solr 4.10.0 with RankingAlgorithm 1.5.4 includes support for complex-lsa.
 complex-lsa simulates human language acquisition and recognition (see 
 demo http://solr-ra.tgels.org/rankingsearchlsa.jsp ) and can 
 retrieve semantically related/hidden relationships between terms, 
 sentences, paragraphs, chapters, books, images, etc. Three new 
 similarities, TERM_SIMILARITY, DOCUMENT_SIMILARITY, 
 TERM_DOCUMENT_SIMILARITY enable these with improved precision.  A 
 query for “holy AND ghost” returns jesus/christ as the top results for 
 the bible corpus with no effort to introduce this relationship (see demo 
 http://solr-ra.tgels.org/rankingsearchlsa.jsp ).
 
  
 
 This version adds support for multiple linear algebra libraries.
 complex-lsa does a large amount of these calcs, so speeding this up
 should speed up retrieval, etc. EJML is the fastest if you are
 using complex-lsa for a smaller set of documents, while MTJ is faster
 as your document collection becomes bigger. MTJ can also use
 BLAS/LAPACK, etc. installed on your system to further improve
 performance with native execution. The performance is similar to a
 C/C++ application. It can also make use of GPUs or Intel's MKL library
 if you have access to it.
 
 RankingAlgorithm 1.5.4 with complex-lsa supports the entire Lucene
 Query Syntax, ± and/or boolean/dismax/glob/regular
 expression/wildcard/fuzzy/prefix/suffix queries with boosting, etc.
 This version increases performance, with increased accuracy and
 relevance for document similarity, and fixes problems with phrase
 queries, Boolean queries, etc.
 
 
 You can get more information about complex-lsa and realtime-search 
 performance from here:
 http://solr-ra.tgels.org/wiki/en/Complex-lsa-demo
 
 You can download Solr 4.10 with RankingAlgorithm 1.5.4 from here:
 http://solr-ra.tgels.org
 
 Please download and give the new version a try.
 
 Regards,
 
 Nagendra Nagarajayya
 http://solr-ra.tgels.org
 http://elasticsearch-ra.tgels.org
 http://rankingalgorithm.tgels.org
 
 Note:
 1. Apache Solr 4.10 with RankingAlgorithm 1.5.4 is an external project.
 
 
 
 



Re: Modify Schema - Schema API

2014-09-10 Thread Joseph Obernberger
Wow - that's really cool!  Thank you!

-Joe



Re: Edismax mm and efficiency

2014-09-10 Thread Peter Keegan
Sure. I created SOLR-6502. The tricky part was handling the behavior in a
sharded index. When the index is sharded, the response from each shard will
contain a parameter that indicates whether the search results are from the
conjunction of all keywords (mm=100%) or from the disjunction (mm=1). If the
shards return both types, then only the results from the conjunction are
returned. This is necessary in order to get the same results independent
of the number of shards.
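
For reference, a minimal client-side sketch of the strict/loose sequence
discussed below, using SolrJ (the field list and request setup are
illustrative; this is the two-request approach, not the SOLR-6502 component):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class StrictThenLooseSearch {

    // First pass with mm=100% (all terms required); if nothing matches,
    // fall back to mm=1 (any single term may match).
    public static QueryResponse search(SolrServer server, String userQuery)
            throws Exception {
        SolrQuery q = new SolrQuery(userQuery);
        q.set("defType", "edismax");
        q.set("qf", "title body");  // illustrative field list
        q.set("mm", "100%");
        QueryResponse rsp = server.query(q);
        if (rsp.getResults().getNumFound() == 0) {
            q.set("mm", "1");
            rsp = server.query(q);
        }
        return rsp;
    }
}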

Peter

On Wed, Sep 10, 2014 at 11:07 AM, Walter Underwood wun...@wunderwood.org
wrote:

 We do that strict/loose query sequence, but on the client side with two
 requests. Would you consider contributing the QueryComponent?

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/


 On Sep 10, 2014, at 3:47 AM, Peter Keegan peterlkee...@gmail.com wrote:

  I implemented a custom QueryComponent that issues the edismax query with
  mm=100%, and if no results are found, it reissues the query with mm=1.
 This
  doubled our query throughput (compared to mm=1 always), as we do some
  expensive RankQuery processing. For your very long student queries,
 mm=100%
  would obviously be too high, so you'd have to experiment.
 
  On Fri, Sep 5, 2014 at 1:34 PM, Walter Underwood wun...@wunderwood.org
  wrote:
 
  Great!
 
  We have some very long queries, where students paste entire homework
  problems. One of them was 1051 words. Many of them are over 100 words.
 This
  could help.
 
  In the Jira discussion, I saw some comments about handling the most
 sparse
  lists first. We did something like that in the Infoseek Ultra engine
 about
  twenty years ago. Short termlists (documents matching a term) were
  processed first, which kept the in-memory lists of matching docs small.
 It
  also allowed early short-circuiting for no-hits queries.
 
  What would be a high mm value, 75%?
 
  wunder
  Walter Underwood
  wun...@wunderwood.org
  http://observer.wunderwood.org/
 
 
  On Sep 4, 2014, at 11:52 PM, Mikhail Khludnev 
 mkhlud...@griddynamics.com
  wrote:
 
  indeed https://issues.apache.org/jira/browse/LUCENE-4571
  my feeling is it gives a significant gain at high mm values.
 
 
 
  On Fri, Sep 5, 2014 at 3:01 AM, Walter Underwood 
 wun...@wunderwood.org
  wrote:
 
  Are there any speed advantages to using “mm”? I can imagine pruning
 the
  set of matching documents early, which could help, but is that (or
  something else) done?
 
  wunder
  Walter Underwood
  wun...@wunderwood.org
  http://observer.wunderwood.org/
 
 
 
 
 
  --
  Sincerely yours
  Mikhail Khludnev
  Principal Engineer,
  Grid Dynamics
 
  http://www.griddynamics.com
  mkhlud...@griddynamics.com
 
 




Re: Reading files in default Conf dir

2014-09-10 Thread Ramana OpenSource
Thank you for the inputs Jorge. Now I am getting the ResourceLoader using
the SolrCore API.

Before:
return new HashSet<String>(
    new SolrResourceLoader(null).getLines("stopwords.txt"));

After:

return new HashSet<String>(
    core.getResourceLoader().getLines("stopwords.txt"));

I am able to load the resource successfully.

Thanks,
Ramana.


On Wed, Sep 10, 2014 at 12:34 PM, Jorge Luis Betancourt Gonzalez 
jlbetanco...@uci.cu wrote:

 What are you developing: a custom search component? An update processor? A
 different class for one of the zillion moving parts of Solr?

 If you have access to a SolrCore instance you can use it to get the
 resource loader; using the SolrCore instance specific to the current core
 will cause the lookup of the file to be local to the conf directory of the
 specified core. In a custom UpdateRequestProcessorFactory which implements
 the SolrCoreAware interface I have the following code:

    @Override
    public void inform(SolrCore solrCore) {
        SolrResourceLoader loader = solrCore.getResourceLoader();

        try {
            List<String> lines = loader.getLines(patternFile);

            if (!lines.isEmpty()) {
                for (String s : lines) {
                    this.patterns.add(Pattern.compile(s));
                }
            }
        } catch (IOException e) {
            SolrCore.log.error(String.format("File %s could not be loaded",
                patternFile));
        }
    }

 Essentially I ask the actual core (solrCore) to provide a
 SolrResourceLoader for its conf directory. In your case you are just
 passing it null, which (I think, haven't tested) causes it to instantiate a
 SolrResourceLoader for the Solr instance (judging from the paths you've
 placed in your mail) instead of a SolrResourceLoader relative to your
 core/collection, which is what you want.

 So, bottom line: implement the SolrCoreAware interface and use the
 SolrResourceLoader provided by that instance. A little more info would also
 be helpful, as we can't figure out what Solr “part” you are developing.
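 
 For completeness, a hedged sketch of how patternFile might get populated,
 assuming the factory receives it as an init argument from solrconfig.xml
 (the arg name is illustrative; imports for NamedList, ArrayList and
 Pattern are assumed):
 
 private String patternFile;
 private final List<Pattern> patterns = new ArrayList<Pattern>();
 
 @Override
 public void init(NamedList args) {
     super.init(args);
     // the file name configured for this factory in solrconfig.xml
     this.patternFile = (String) args.get("patternFile");
 }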

 Regards,

 On Sep 9, 2014, at 2:37 PM, Ramana OpenSource ramanaopensou...@gmail.com
 wrote:

  Hi,
 
  I am trying to load one of the files in the conf directory in Solr, using
  the below code.
 
  return new HashSet<String>(
      new SolrResourceLoader(null).getLines("stopwords.txt"));
 
  The stopwords.txt file is available in the location
  solr\example\solr\collection1\conf.
 
  When I debugged the SolrResourceLoader API, it was looking at the below
  locations to load the file:
 
  ...solr\example\solr\conf\stopwords.txt
  ...solr\example\stopwords.txt
 
  But as the file was not in any of the above locations, it failed.
 
  How can I load files in the default conf directory using the
  SolrResourceLoader API?
 
  I am a newbie to Solr. Any help would be appreciated.
 
  Thanks,
  Ramana.




How to get access to SolrCore in init method of Handler Class

2014-09-10 Thread Ramana OpenSource
Hi,

I need to load a file in the instance's conf directory, and this data is going
to be used in the handleRequestBody() implementation. As of now, I am loading
the file in the handleRequestBody method like below.

SolrCore solrCore = req.getCore();
solrCore.getResourceLoader().getLines(fileToLoad);

But to make it better, I would like to load this file only once, in the
init() method of the handler class. I am not sure how to get access to the
SolrCore in the init method.

Any help would be appreciated.

Thanks,
Ramana.


Re: How to get access to SolrCore in init method of Handler Class

2014-09-10 Thread Chris Hostetter

: But to make it better, I would like to load this file only once, in the
: init() method of the handler class. I am not sure how to get access to the
: SolrCore in the init method.

you can't access the SolrCore during the init() method, because at the 
time it's called the SolrCore itself is not yet fully initialized.

what you can do is implement the SolrCoreAware interface, and then 
you will be guaranteed that *after* your init method is called, and before 
you are ever asked to handle any requests, your inform(SolrCore) method 
will be called...
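
For illustration, a minimal sketch of that pattern for a request handler
(assuming Solr 4.x; the class, field, and file names are illustrative, and
the SolrInfoMBean boilerplate may differ slightly between versions):

import java.util.List;

import org.apache.solr.core.SolrCore;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.util.plugin.SolrCoreAware;

public class FileBackedHandler extends RequestHandlerBase implements SolrCoreAware {

    private List<String> lines;

    @Override
    public void inform(SolrCore core) {
        // Runs once per core load, after init() and before any request.
        try {
            lines = core.getResourceLoader().getLines("fileToLoad.txt");
        } catch (Exception e) {
            throw new RuntimeException("could not load conf file", e);
        }
    }

    @Override
    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
            throws Exception {
        // 'lines' is guaranteed to be populated by the time requests arrive.
        rsp.add("lineCount", lines.size());
    }

    @Override
    public String getDescription() {
        return "handler that preloads a file from the conf directory";
    }

    @Override
    public String getSource() {
        return null;
    }
}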


-Hoss
http://www.lucidworks.com/


Inconsistent relevancy score between browser refreshes

2014-09-10 Thread Tao, Jing
I am seeing different relevancy scores for the same documents between browser 
refreshes.  Any ideas why?  The query is the same and the index is the same - 
why would the score change?

Example:
First request returns:
<doc>
  <str name="title">Stroke Anticoagulation and Prophylaxis</str>
  <float name="score">3.463463</float>
</doc>
<doc>
  <str name="title">Hemorrhagic Stroke</str>
  <float name="score">3.463463</float>
</doc>
<doc>
  <str name="title">Vertebrobasilar Stroke</str>
  <float name="score">3.460521</float>
</doc>

Second request:
<doc>
  <str name="title">Vertebrobasilar Stroke</str>
  <float name="score">3.460521</float>
</doc>
<doc>
  <str name="title">Hemorrhagic Stroke</str>
  <float name="score">3.4484053</float>
</doc>
<doc>
  <str name="title">Stroke Anticoagulation and Prophylaxis</str>
  <float name="score">3.4484053</float>
</doc>

Third request:
<doc>
  <str name="title">Stroke Anticoagulation and Prophylaxis</str>
  <float name="score">3.463463</float>
</doc>
<doc>
  <str name="title">Hemorrhagic Stroke</str>
  <float name="score">3.463463</float>
</doc>
<doc>
  <str name="title">Vertebrobasilar Stroke</str>
  <float name="score">3.402718</float>
</doc>


Jing


Re: Creating Solr servers dynamically in Multicore folder

2014-09-10 Thread Erick Erickson
You should be good to go. Do note that you can override the variables that
were defined in your schema.xml in the individual core.properties file for
the core in question if you need to, although the defaults work for
most people's needs.
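
For example, a minimal core.properties for one of these cores (all values
illustrative):

name=core1
config=solrconfig.xml
schema=schema.xml
dataDir=data
loadOnStartup=true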


Best,
Erick

On Tue, Sep 9, 2014 at 9:15 PM, nishwanth nishwanth.vupp...@gmail.com wrote:
 Hello Erick,

 Thanks for the response. My cores got created now after removing the
 core.properties in this location and the existing core folders.

 Also I commented out the core-related information in solr.xml. Are there
 going to be any further problems with the approach I followed?

 For the new cores I created, I could see the conf and data directories and
 the core.properties file getting created.

 Thanks..








Re: Problem while extending TokenizerFactory in Solr 4.4.0

2014-09-10 Thread Erick Erickson
Francesco:

What was the fix? It'll help others with the same issue.

On Wed, Sep 10, 2014 at 6:53 AM, Francesco Valentini
valentin...@gmail.com wrote:
 Hi Shawn,
 thank you very much for your quick answer,
 I fixed it.

 Thanks
 Francesco

 2014-09-10 15:34 GMT+02:00 Shawn Heisey s...@elyograg.org:

 On 9/10/2014 7:14 AM, Francesco Valentini wrote:
  I’m using the Solr 4.4.0 distro, and now I have a strange issue while
  extending TokenizerFactory with a custom class.

 I think what we have here is a basic Java error, nothing specific to
 Solr.  This jumps out at me:

  Caused by: java.lang.NoSuchMethodException:
  com.mytest.tokenizer.RelationChunkTokenizerFactory.<init>(java.util.Map)
  at java.lang.Class.getConstructor0(Unknown Source)
  at java.lang.Class.getConstructor(Unknown Source)
  at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:552)
  ... 21 more

 Java is trying to execute a method that doesn't exist.  The
 getConstructor pieces after the message suggest that perhaps it's a
 constructor with a Map as an argument, but I'm not familiar enough with
 this error to know whether it's trying to run a constructor that doesn't
 exist, or whether it's trying to actually use a method called init.

 The constructor in TokenizerFactory is protected, and all of the
 existing descendants that I looked at have a public constructor ... this
 message would make sense in all of the following situations:

 1) You didn't create a constructor for your object with a Map argument.
 2) You made your constructor protected.
 3) You made your constructor private.
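 
 As a sketch, this is the shape a 4.4 TokenizerFactory subclass needs (the
 class name comes from your stack trace; the create() body is purely
 illustrative and just delegates to a whitespace tokenizer):
 
 import java.io.Reader;
 import java.util.Map;
 import org.apache.lucene.analysis.Tokenizer;
 import org.apache.lucene.analysis.core.WhitespaceTokenizer;
 import org.apache.lucene.analysis.util.TokenizerFactory;
 import org.apache.lucene.util.AttributeSource.AttributeFactory;
 
 public class RelationChunkTokenizerFactory extends TokenizerFactory {
 
     // The public Map constructor that SolrResourceLoader looks up via
     // reflection; without it you get the NoSuchMethodException above.
     public RelationChunkTokenizerFactory(Map<String, String> args) {
         super(args);
         if (!args.isEmpty()) {
             throw new IllegalArgumentException("Unknown parameters: " + args);
         }
     }
 
     @Override
     public Tokenizer create(AttributeFactory factory, Reader input) {
         return new WhitespaceTokenizer(luceneMatchVersion, factory, input);
     }
 }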

 Thanks,
 Shawn




Re: Problems for indexing large documents on SolrCloud

2014-09-10 Thread Erick Erickson
bq: org.apache.solr.common.SolrException: Unexpected end of input
block; expected an identifier

This is very often an indication that your packets are being
truncated by something in the chain. In your case, make sure
that Tomcat is configured to handle inputs of the size that you're sending.

This may be happening before things get to Solr, in which case your settings
in solrconfig.xml aren't germane; the problem is earlier than that.

A semi-smoking-gun here is that there's a size of your multivalued
field that seems to break things... That doesn't rule out timeout problems,
of course.

But I'd look at the Tomcat settings for maximum packet size first.

Best,
Erick

On Wed, Sep 10, 2014 at 9:11 AM, Olivier olivau...@gmail.com wrote:
 Hi,

 I have some problems indexing large documents in a SolrCloud cluster of
 3 servers (Solr 4.8.1), with 3 shards and 2 replicas for each shard, on
 Tomcat 7.
 For a specific document (with 300K values in a multivalued field), I
 couldn't index it on SolrCloud, but I could in a single instance of
 Solr on my own PC.

 The indexing is done with Solarium from a database. The data indexed are
 e-commerce products with classic fields like name, price, description,
 instock, etc. The large field (type int) consists of other product ids.
 The only difference from the other documents well-indexed on Solr is the
 size of that multivalued field: the well-indexed documents all have
 between 100K and 200K values for that field.
 The index size is 11 MB for 20 documents.

 To solve it, I tried to change several parameters, including the ZK
 timeouts in solr.xml:

 In the solrcloud section:

 <int name="zkClientTimeout">6</int>
 <int name="distribUpdateConnTimeout">10</int>
 <int name="distribUpdateSoTimeout">10</int>

 In the shardHandlerFactory section:

 <int name="socketTimeout">${socketTimeout:10}</int>
 <int name="connTimeout">${connTimeout:10}</int>

 I also tried to increase these values in solrconfig.xml:

 <requestParsers enableRemoteStreaming="true"
                 multipartUploadLimitInKB="1"
                 formdataUploadLimitInKB="10"
                 addHttpRequestToContext="false" />




 I also tried to increase the amount of RAM (they are VMs): each server
 has 4 GB of RAM, with 3 GB for the JVM.

 Are there other settings which could solve the problem that I may have
 forgotten?


 The error messages are:

 ERROR  SolrDispatchFilter  null:java.lang.RuntimeException: [was class
 java.net.SocketException] Connection reset

 ERROR  SolrDispatchFilter  null:ClientAbortException:
 java.net.SocketException: broken pipe

 ERROR  SolrDispatchFilter  null:ClientAbortException:
 java.net.SocketException: broken pipe

 ERROR  SolrCore  org.apache.solr.common.SolrException: Unexpected end of
 input block; expected an identifier

 ERROR  SolrCore  org.apache.solr.common.SolrException: Unexpected end of
 input block; expected an identifier

 ERROR  SolrCore  org.apache.solr.common.SolrException: Unexpected end of
 input block; expected an identifier

 ERROR  SolrCore  org.apache.solr.common.SolrException: Unexpected EOF in
 attribute value

 ERROR  SolrCore  org.apache.solr.common.SolrException: Unexpected end of
 input block in start tag


 Thanks,

 Olivier

Re: Inconsistent relevancy score between browser refreshes

2014-09-10 Thread Erick Erickson
More info please.

1> Are there replicas involved?
2> Is there any indexing going on?
3> If more than one node, did you optimize?
4> Did you optimize between refreshes?

Best,
Erick

On Wed, Sep 10, 2014 at 12:28 PM, Tao, Jing j...@webmd.net wrote:
 I am seeing different relevancy scores for the same documents between 
 browser refreshes.  Any ideas why?  The query is the same and the index is 
 the same - why would the score change?

 Example:
 First request returns:
 <doc>
   <str name="title">Stroke Anticoagulation and Prophylaxis</str>
   <float name="score">3.463463</float>
 </doc>
 <doc>
   <str name="title">Hemorrhagic Stroke</str>
   <float name="score">3.463463</float>
 </doc>
 <doc>
   <str name="title">Vertebrobasilar Stroke</str>
   <float name="score">3.460521</float>
 </doc>

 Second request:
 <doc>
   <str name="title">Vertebrobasilar Stroke</str>
   <float name="score">3.460521</float>
 </doc>
 <doc>
   <str name="title">Hemorrhagic Stroke</str>
   <float name="score">3.4484053</float>
 </doc>
 <doc>
   <str name="title">Stroke Anticoagulation and Prophylaxis</str>
   <float name="score">3.4484053</float>
 </doc>

 Third request:
 <doc>
   <str name="title">Stroke Anticoagulation and Prophylaxis</str>
   <float name="score">3.463463</float>
 </doc>
 <doc>
   <str name="title">Hemorrhagic Stroke</str>
   <float name="score">3.463463</float>
 </doc>
 <doc>
   <str name="title">Vertebrobasilar Stroke</str>
   <float name="score">3.402718</float>
 </doc>


 Jing


RE: Inconsistent relevancy score between browser refreshes

2014-09-10 Thread Tao, Jing
1) It is a SolrCloud setup on 4 servers, 4 shards, replication factor of 2.
2) There is no indexing going on.
3) No, I did not optimize.
4) Did not optimize between refreshes.

Thanks,
Jing

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, September 10, 2014 4:09 PM
To: solr-user@lucene.apache.org
Subject: Re: Inconsistent relevancy score between browser refreshes

More info please.

1> Are there replicas involved?
2> Is there any indexing going on?
3> If more than one node, did you optimize?
4> Did you optimize between refreshes?

Best,
Erick

On Wed, Sep 10, 2014 at 12:28 PM, Tao, Jing j...@webmd.net wrote:
 I am seeing different relevancy scores for the same documents between 
 browser refreshes.  Any ideas why?  The query is the same and the index is 
 the same - why would the score change?

 Example:
 First request returns:
 <doc>
   <str name="title">Stroke Anticoagulation and Prophylaxis</str>
   <float name="score">3.463463</float>
 </doc>
 <doc>
   <str name="title">Hemorrhagic Stroke</str>
   <float name="score">3.463463</float>
 </doc>
 <doc>
   <str name="title">Vertebrobasilar Stroke</str>
   <float name="score">3.460521</float>
 </doc>

 Second request:
 <doc>
   <str name="title">Vertebrobasilar Stroke</str>
   <float name="score">3.460521</float>
 </doc>
 <doc>
   <str name="title">Hemorrhagic Stroke</str>
   <float name="score">3.4484053</float>
 </doc>
 <doc>
   <str name="title">Stroke Anticoagulation and Prophylaxis</str>
   <float name="score">3.4484053</float>
 </doc>

 Third request:
 <doc>
   <str name="title">Stroke Anticoagulation and Prophylaxis</str>
   <float name="score">3.463463</float>
 </doc>
 <doc>
   <str name="title">Hemorrhagic Stroke</str>
   <float name="score">3.463463</float>
 </doc>
 <doc>
   <str name="title">Vertebrobasilar Stroke</str>
   <float name="score">3.402718</float>
 </doc>


 Jing


Re: ExtractingRequestHandler indexing zip files

2014-09-10 Thread keeblerh
Thanks for the info Sergio.  I updated my 4.8.1 version with that patch and
SOLR-4216 (which was really the same thing).  It took a day to get it to
compile on my network and it still doesn't work.  Did my config file look
correct?  I'm wondering if I need another param somewhere.

The patch has to be applied to the source code and Solr.war compiled again.
If you do that, then it works, extracting the content of documents.





Re: Solr WARN Log

2014-09-10 Thread Chris Hostetter
:  I'm trying to upgrade Solr from version 4.2 to 4.9, since then I'm
...
: haven't configured it.  You can ignore this message.  To get it to go

The fact that a WARN is logged at all was a bug in 4.9 that got fixed in 
4.10...

https://issues.apache.org/jira/browse/SOLR-6179


-Hoss
http://www.lucidworks.com/


Re: Problems for indexing large documents on SolrCloud

2014-09-10 Thread Shawn Heisey
On 9/10/2014 2:05 PM, Erick Erickson wrote:
 bq: org.apache.solr.common.SolrException: Unexpected end of input
 block; expected an identifier

 This is very often an indication that your packets are being
 truncated by something in the chain. In your case, make sure
 that Tomcat is configured to handle inputs of the size that you're sending.

 This may be happening before things get to Solr, in which case your settings
 in solrconfig.xml aren't germane, the problem is earlier than than.

 A semi-smoking-gun here is that there's a size of your multivalued
 field that seems to break things... That doesn't rule out time problems
 of course.

 But I'd look at the Tomcat settings for maximum packet size first.

The maximum HTTP request size is actually controlled by Solr itself
since 4.1, with changes committed for SOLR-4265.  Changing the setting
on Tomcat probably will not help.

An example from my own config which sets this to 32MB - the default is
2048, or 2MB:

 <requestParsers enableRemoteStreaming="false"
     multipartUploadLimitInKB="32768" formdataUploadLimitInKB="32768" />

Thanks,
Shawn



Re: How to get access to SolrCore in init method of Handler Class

2014-09-10 Thread Ramana OpenSource
Thanks Chris. I have implemented the SolrCoreAware interface and am loading
the required file in the inform method.

Thanks,
Ramana.

On Wed, Sep 10, 2014 at 10:59 PM, Chris Hostetter hossman_luc...@fucit.org
wrote:


 : But to make it better, I would like to load this file only once, in the
 : init() method of the handler class. I am not sure how to get access to
 : the SolrCore in the init method.

 you can't access the SolrCore during the init() method, because at the
 time it's called the SolrCore itself is not yet fully initialized.

 what you can do is implement the SolrCoreAware interface, and then
 you will be guaranteed that *after* your init method is called, and before
 you are ever asked to handle any requests, your inform(SolrCore) method
 will be called...


 -Hoss
 http://www.lucidworks.com/



Re: Creating Solr servers dynamically in Multicore folder

2014-09-10 Thread nishwanth
Hello Erick,

Thanks for the response.

I have attached the core.properties and solr.xml for your reference:

solr.xml: http://lucene.472066.n3.nabble.com/file/n4158124/solr.xml
core.properties: http://lucene.472066.n3.nabble.com/file/n4158124/core.properties

Below is our plan for creating cores.

Every Tenant (user) is bound to some Contacts, Sales, Orders and other
information. The number of tenants for our application will be approximately
10,000.

We are planning to create a core for every tenant and maintain the
Contacts, Sales, Orders and other information as a collection. So every time
a tenant logs in, this information will be used.

Could you please let us know your thoughts on this approach.

Regards,
Nishwanth



