Re: Czech stemmer

2014-09-09 Thread Lukáš Vlček
Hi,

I would recommend looking at a stemmer or token filter based on Hunspell
dictionaries. I am not a Solr user, so I cannot point you to the appropriate
documentation, but the Czech dictionary that can be used with Hunspell is of
high quality. It can be downloaded from OpenOffice here
http://extensions.services.openoffice.org/en/project/czech-dictionary-pack-ceske-slovniky-cs-cz
(distributed under the GPL).

Note: when I last looked at it, I noticed that the dictionary contained one
broken affix rule, which may require a manual fix depending on how strict the
rule loader is in Solr. If you are interested in more details and cannot
figure it out yourself, feel free to ping me again; I can point you to some
resources about how I used it with Elasticsearch. I assume the basic concepts
apply to Solr as well.
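
For a quick sanity check outside Solr, something like this (an untested
sketch; the Lucene 4.8+ Hunspell API that ships with Solr 4.10 is assumed,
and the file paths are placeholders) loads the cs_CZ pair and stems the two
forms from the original question. In Solr itself the equivalent piece is a
solr.HunspellStemFilterFactory entry in the fieldType's analyzer chain:

import java.io.FileInputStream;
import java.io.StringReader;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.hunspell.Dictionary;
import org.apache.lucene.analysis.hunspell.HunspellStemFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class CzechHunspellCheck {
  public static void main(String[] args) throws Exception {
    // Affix + dictionary files from the OpenOffice pack; a broken affix rule
    // would surface here as a parse error.
    Dictionary dict = new Dictionary(
        new FileInputStream("cs_CZ.aff"), new FileInputStream("cs_CZ.dic"));

    // Stem the two forms from the question; both should share a stem.
    TokenStream ts = new HunspellStemFilter(
        new WhitespaceTokenizer(new StringReader("posunout posunulo")), dict);
    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    ts.reset();
    while (ts.incrementToken()) {
      System.out.println(term.toString());
    }
    ts.end();
    ts.close();
  }
}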

Regards,
Lukas

2014-09-09 22:14 GMT+02:00 Shamik Bandopadhyay :

> Hi,
>
>   I'm facing stemming issues with Czech language search. Solr/Lucene
> currently provides CzechStemFilterFactory as the sole option; Snowball
> Porter doesn't seem to be available for Czech. Here's the issue.
>
> I'm trying to search for "posunout" ("move" in English), which returns
> results, but the search fails if I use "posunulo" ("moved" in English). I
> used the following text as the field content for the search.
>
> "Pomocí multifunkčních uzlů je možné odkazy mnoha způsoby upravovat. Můžete
> přidat a odstranit odkazy, přidat a odstranit vrcholy, prodloužit nebo
> přesunout prodloužení čáry nebo přesunout text odkazu. Přístup k požadované
> možnosti získáte po přesunutí ukazatele myši na uzel. Z uzlu prodloužení
> čáry můžete zvolit tyto možnosti: Protáhnout: Umožňuje posunout prodloužení
> odkazové čáry. Délka prodloužení čáry: Umožňuje prodloužit prodloužení
> čáry. Přidat odkaz: Umožňuje přidat jednu nebo více odkazových čar. Z uzlu
> koncového bodu odkazu můžete zvolit tyto možnosti: Protáhnout: Umožňuje
> posunout koncový bod odkazové čáry. Přidat vrchol: Umožňuje přidat vrchol k
> odkazové čáře. Odstranit odkaz: Umožňuje odstranit vybranou odkazovou čáru.
> Z uzlu vrcholu odkazu můžete zvolit tyto možnosti: Protáhnout: Umožňuje
> posunout vrchol. Přidat vrchol: Umožňuje přidat vrchol na odkazovou čáru.
> Odstranit vrchol: Umožňuje odstranit vrchol. "
>
> Just wondering if there's a different stemmer available or a way to address
> this.
>
> Schema :
>
> [schema XML stripped by the mail archive; the surviving attributes show a
> fieldType with positionIncrementGap="100" and autoGeneratePhraseQueries="true",
> index and query analyzers, a StopFilterFactory with
> words="lang/stopwords_cz.txt", and a synonym filter with ignoreCase="true"
> expand="true"]
>
> Any pointers will be appreciated.
>
> - Thanks,
> Shamik
>


Re: Creating Solr servers dynamically in Multicore folder

2014-09-09 Thread nishwanth
Hello Erick,

Thanks for the response. My cores got created after removing the
core.properties file in that location and the existing core folders.

I also commented out the core-related information in solr.xml. Are there
going to be any further problems with the approach I followed?

For the new cores I created, I could see the conf and data folders and the
core.properties file being created.

Thanks..






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Creating-Solr-servers-dynamically-in-Multicore-folder-tp4157550p4157747.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr WARN Log

2014-09-09 Thread Joseph V J
Thank you for the update, Shawn.

~Regards
Joe

On Tue, Sep 9, 2014 at 6:14 PM, Shawn Heisey  wrote:

> On 9/9/2014 2:56 AM, Joseph V J wrote:
> > I'm trying to upgrade Solr from version 4.2 to 4.9, since then I'm
> > receiving the following warning from solr log. It would be great if
> anyone
> > could throw some light into it.
> >
> > Level Logger Message
> > WARN ManagedResource *No registered observers for /rest/managed*
> >
> > OS Used : Debian GNU/Linux 7
>
> This message comes from the new Schema REST API.  Basically it means you
> haven't configured it.  You can ignore this message.  To get it to go
> away, you would need to configure the new feature.
>
> https://cwiki.apache.org/confluence/display/solr/Schema+API
>
> Thanks,
> Shawn
>
>


Re: Create collection dynamically in my program

2014-09-09 Thread xinwu
Hi, Jürgen:
Thanks for your reply.
What is the result of the call? Any status or error message?
——The call ended normally, and there was no error message.
Did you actually feed data into the collection?
——Yes, I feed data into the daily collection every day.

Thanks!
-Xinwu 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Create-collection-dynamically-in-my-program-tp4156601p4157742.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr multiple sources configuration

2014-09-09 Thread Jack Krupansky
It is mostly a matter of how you expect to query that data - do you need 
different queries for different sources, or do you have a common conceptual 
model that covers all sources with a common set of queries?


-- Jack Krupansky

-Original Message- 
From: vineet yadav

Sent: Tuesday, September 9, 2014 6:40 PM
To: solr-user@lucene.apache.org
Subject: Solr multiple sources configuration

Hi,
I am using Solr to store data from multiple sources like social media,
news, journals, etc., so I am using a crawler, multiple scrapers, and APIs
to gather data. I want to know the best way to configure Solr so that I can
store data that comes from multiple sources.

Thanks
Vineet Yadav 



Solr multiple sources configuration

2014-09-09 Thread vineet yadav
Hi,
I am using Solr to store data from multiple sources like social media,
news, journals, etc., so I am using a crawler, multiple scrapers, and APIs
to gather data. I want to know the best way to configure Solr so that I can
store data that comes from multiple sources.

Thanks
Vineet Yadav


Re: Using def function in fl criteria,

2014-09-09 Thread Chris Hostetter

: I'm trying to use a query with
: fl=name_UK,name_FRA,itemDesc:def(name_UK,name_FRA)
: As you can see, the itemDesc field (built by Solr) is truncated:

functions get their values from the FieldCache (or DocValues if you've
enabled them) so that they can be efficient across a lot of docs.

based on what you are getting back from the def() function, you almost
certainly have a fieldType for name_UK that uses an analyzer that
tokenizes the field, so you're getting back one of the indexed terms.

you could theoretically index these fields again using something like
StrField or KeywordTokenizerFactory and use that via the def() function --
but honestly that's going to be a lot less efficient than just letting
your client pick between the two values, or writing your own
DocTransformer to conditionally rename/remove the stored field values you
don't want...

https://lucene.apache.org/solr/4_10_0/solr-core/org/apache/solr/response/transform/TransformerFactory.html
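
a rough sketch of that transformer route (untested; the class and field
names are made up, and the 4.10 transformer API is assumed) -- it copies the
stored name_UK value into itemDesc when present, falling back to name_FRA,
with no FieldCache involved:

import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.transform.DocTransformer;
import org.apache.solr.response.transform.TransformerFactory;

public class ItemDescTransformerFactory extends TransformerFactory {
  @Override
  public DocTransformer create(String field, SolrParams params, SolrQueryRequest req) {
    return new DocTransformer() {
      @Override
      public String getName() { return "itemdesc"; }

      @Override
      public void transform(SolrDocument doc, int docid) {
        Object v = doc.getFieldValue("name_UK");  // stored value, not indexed terms
        if (v == null) v = doc.getFieldValue("name_FRA");
        if (v != null) doc.setField("itemDesc", v);
      }
    };
  }
}

registered under a name via a <transformer> entry in solrconfig.xml, it
would be requested as fl=name_UK,name_FRA,[itemdesc]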



-Hoss
http://www.lucidworks.com/


Re: How to implement multilingual word components fields schema?

2014-09-09 Thread Paul Libbrecht
Ilia,

one aspect you surely lose with a single-field approach is the differentiation
of semantic fields in different languages for words that sound the same.
The words "sitting" and "directions" are easy examples that have fully different
semantics in French and English, at least.
"directions" would appear common with, say, teacher advice in English but not
in French.

I disagree that storage should be an issue in your case… most Solr
installations do not suffer from that, as far as I can read on the list.
Generally, you do not need all these stemmed fields to be stored; they're just
indexed, and that is a pretty tiny amount of storage.

Using separate fields also has advantages in terms of IDF, I think.

I do not understand the last question to Tom; he provides URLs to at least one
of the papers.

Also, if you can get your hands on it, the book by Peters, Braschler, and Clough
is probably relevant: http://link.springer.com/book/10.1007%2F978-3-642-23008-0
but, as the first article referenced by Tom says, the CLIR approach there relies
on parallel corpora, e.g. created by automatic translations.


Paul




On 8 sept. 2014, at 07:33, Ilia Sretenskii  wrote:

> Thank you for the replies, guys!
> 
> Using field-per-language approach for multilingual content is the last
> thing I would try since my actual task is to implement a search
> functionality which would implement relatively the same possibilities for
> every known world language.
> The closest references are those popular web search engines, they seem to
> serve worldwide users with their different languages and even
> cross-language queries as well.
> Thus, a field-per-language approach would be a sure waste of storage
> resources due to the high number of duplicates, since there are over 200
> known languages.
> I really would like to keep a single field for cross-language searchable
> text content, without splitting it into language-specific fields or
> language-specific cores.
> 
> So my current choice will be to stay with just the ICUTokenizer and
> ICUFoldingFilter as they are without any language specific
> stemmers/lemmatizers yet at all.
> 
> Probably I will put the most popular languages' stop word filters and
> stemmers into the same one searchable text field to give it a try and see
> if it works correctly in a stack.
> Does stacking language-specific filters work correctly in one field?
> 
> Further development will most likely involve some advanced custom analyzers
> like the "SimplePolyGlotStemmingTokenFilter" to utilize the ICU generated
> ScriptAttribute.
> http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/100236
> https://github.com/whateverdood/cross-lingual-search/blob/master/src/main/java/org/apache/lucene/sandbox/analysis/polyglot/SimplePolyGlotStemmingTokenFilter.java
> 
> So I would like to know more about those "academic papers on this issue of
> how best to deal with mixed language/mixed script queries and documents".
> Tom, could you please share them?



Wildcard in FL parameter not working with Solr 4.10.0

2014-09-09 Thread Mike Hugo
Hello,

With Solr 4.7 we had some queries that returned dynamic fields by passing in
an fl=*_exact parameter; this is not working for us after upgrading to Solr
4.10.0. It appears to only be a problem when requesting wildcarded
fields via SolrJ.

With Solr 4.10.0 - I downloaded the binary and set up the example:

cd example
java -jar start.jar
java -jar post.jar solr.xml monitor.xml

In a browser, if I request

http://localhost:8983/solr/collection1/select?q=*:*&wt=json&indent=true&fl=*d

All is well with the world:

{"responseHeader": {"status": 0,"QTime": 1,"params": {"fl": "*d","indent": "
true","q": "*:*","wt": "json"}},"response": {"numFound": 2,"start": 0,"docs
": [{"id": "SOLR1000"},{"id": "3007WFP"}]}}

However if I do the same query with SolrJ (groovy script)


@Grab(group = 'org.apache.solr', module = 'solr-solrj', version = '4.10.0')

import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.HttpSolrServer

HttpSolrServer solrServer = new HttpSolrServer("
http://localhost:8983/solr/collection1";)
SolrQuery q = new SolrQuery("*:*")
*q.setFields("*d")*
println solrServer.query(q)


No fields are returned:

{responseHeader={status=0,QTime=0,params={fl=*d,q=*:*,wt=javabin,version=2}},response={numFound=2,start=0,docs=[SolrDocument{},
SolrDocument{}]}}



Any ideas as to why when using SolrJ wildcarded fl fields are not returned?

Thanks,

Mike


Czech stemmer

2014-09-09 Thread Shamik Bandopadhyay
Hi,

  I'm facing stemming issues with Czech language search. Solr/Lucene
currently provides CzechStemFilterFactory as the sole option; Snowball
Porter doesn't seem to be available for Czech. Here's the issue.

I'm trying to search for "posunout" ("move" in English), which returns
results, but the search fails if I use "posunulo" ("moved" in English). I
used the following text as the field content for the search.

"Pomocí multifunkčních uzlů je možné odkazy mnoha způsoby upravovat. Můžete
přidat a odstranit odkazy, přidat a odstranit vrcholy, prodloužit nebo
přesunout prodloužení čáry nebo přesunout text odkazu. Přístup k požadované
možnosti získáte po přesunutí ukazatele myši na uzel. Z uzlu prodloužení
čáry můžete zvolit tyto možnosti: Protáhnout: Umožňuje posunout prodloužení
odkazové čáry. Délka prodloužení čáry: Umožňuje prodloužit prodloužení
čáry. Přidat odkaz: Umožňuje přidat jednu nebo více odkazových čar. Z uzlu
koncového bodu odkazu můžete zvolit tyto možnosti: Protáhnout: Umožňuje
posunout koncový bod odkazové čáry. Přidat vrchol: Umožňuje přidat vrchol k
odkazové čáře. Odstranit odkaz: Umožňuje odstranit vybranou odkazovou čáru.
Z uzlu vrcholu odkazu můžete zvolit tyto možnosti: Protáhnout: Umožňuje
posunout vrchol. Přidat vrchol: Umožňuje přidat vrchol na odkazovou čáru.
Odstranit vrchol: Umožňuje odstranit vrchol. "

Just wondering if there's a different stemmer available or a way to address
this.

Schema :

[schema XML stripped by the mail archive]
Any pointers will be appreciated.

- Thanks,
Shamik


Re: ExtractingRequestHandler indexing zip files

2014-09-09 Thread marotosg
hi keeblerh,

The patch has to be applied to the source code, and Solr.war compiled again.
If you do that, it works and extracts the content of the documents.

Regards,
Sergio



--
View this message in context: 
http://lucene.472066.n3.nabble.com/ExtractingRequestHandler-indexing-zip-files-tp4138172p4157673.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Using def function in fl criteria,

2014-09-09 Thread Pigeyre Romain
I want to return:

- the field name_UK (if it exists)

- otherwise the name_FRA field
... into an alias field (itemDesc, created at query time).

There is no schema definition for itemDesc because it is only a virtual field
declared in the fl= criteria. I don't understand why a filter is being applied
to this field.

On Tue, Sep 9, 2014 at 17:44, Erick Erickson <erickerick...@gmail.com> wrote:

> I'm really confused about what you're trying to do here. What do you
> intend the syntax
> itemDesc:def(name_UK,name_FRA)
> to do?
>
> It's also really difficult to say much of anything unless we see the
> schema definition for "itemDesc" and sample input.
>
> Likely you're somehow applying an analysis chain that is truncating
> the input. Or it's also possible that you aren't indexing quite what
> you think you are.
>
> Best,
> Erick
>
> On Tue, Sep 9, 2014 at 4:36 AM, Pigeyre Romain <romain.pige...@sopra.com> wrote:
> > Hi
> >
> > I'm trying to use a query with 
> > fl=name_UK,name_FRA,itemDesc:def(name_UK,name_FRA)
> > As you can see, the itemDesc field (built by Solr) is truncated:
> >
> > {
> > "name_UK": "MEN S SUIT\n",
> > "name_FRA": "24 RELAX 2 BTS ST GERMAIN TOILE FLAMMEE LIN ET SOIE",
> > "itemDesc": "suit"
> >   }
> >
> > Do you have any idea to change it?
> >
> > Thanks.
> >
> > Regards,
> >
> > Romain




Re: [Announce] Apache Solr 4.10 with RankingAlgorithm 1.5.4 available now with complex-lsa algorithm (simulates human language acquisition and recognition)

2014-09-09 Thread Alexandre Rafalovitch
On Tue, Sep 9, 2014 at 1:38 PM, Diego Fernandez  wrote:
> Interesting.  Does anyone know how that compares to this 
> http://www.searchbox.com/products/searchbox-plugins/solr-sense/?

Well, for one, the Solr-sense pricing seems to be so sense-tive that
you have to contact the sales team to find it out. The version
announced here is free for public and commercial use AFAIK.

I have not tested either one yet.

Regards,
   Alex.

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


Reading files in default Conf dir

2014-09-09 Thread Ramana OpenSource
Hi,

I am trying to load one of the files in the conf directory in Solr, using the
code below.

return new HashSet(new SolrResourceLoader(null).getLines("stopwords.txt"));

The "stopwords.txt" file is available in the location
"solr\example\solr\collection1\conf".

When I debugged the SolrResourceLoader API, it looked at the following
locations to load the file:

...solr\example\solr\conf\stopwords.txt
...solr\example\stopwords.txt

But as the file was not in either of those locations, it failed.

How do I load files from the default conf directory using the
SolrResourceLoader API?

I am a newbie to Solr. Any help would be appreciated.
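
(A hedged guess based on the probed paths above: passing null for instanceDir
makes SolrResourceLoader fall back to a default of solr/, so it never looks
under the collection1 core. Pointing it at the core's instance directory --
the path below is an assumption matching the example layout -- should let
getLines() find conf/stopwords.txt:)

import java.util.HashSet;

import org.apache.solr.core.SolrResourceLoader;

public class StopwordsLoader {
  public static void main(String[] args) throws Exception {
    // instanceDir of the core; resources are resolved against its conf/ dir
    SolrResourceLoader loader = new SolrResourceLoader("solr/collection1");
    HashSet<String> stopwords =
        new HashSet<String>(loader.getLines("stopwords.txt"));
    System.out.println(stopwords.size() + " stopwords loaded");
  }
}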

Thanks,
Ramana.


field specified edismax

2014-09-09 Thread Jae Joo
Is there any way to apply different edismax parameters field by field?
For example:
q=keywords:(lung cancer) AND title:chemotherapy

I would like to apply a different qf to each of the fields keywords and title:
f.keywords.qf=keywords^40 subkeywords^20
f.title.qf=title^80 subtitle^20

I know it can be done by field aliasing, but I'd prefer not to use field
aliasing.

Thanks,

Jae


Re: Using def function in fl criteria,

2014-09-09 Thread Erick Erickson
I'm really confused about what you're trying to do here. What do you
intend the syntax
itemDesc:def(name_UK,name_FRA)
to do?

It's also really difficult to say much of anything unless we see the
schema definition for "itemDesc" and sample input.

Likely you're somehow applying an analysis chain that is truncating
the input. Or it's also possible that you aren't indexing quite what
you think you are.

Best,
Erick

On Tue, Sep 9, 2014 at 4:36 AM, Pigeyre Romain  wrote:
> Hi
>
> I'm trying to use a query with 
> fl=name_UK,name_FRA,itemDesc:def(name_UK,name_FRA)
> > As you can see, the itemDesc field (built by Solr) is truncated:
>
> {
> "name_UK": "MEN S SUIT\n",
> "name_FRA": "24 RELAX 2 BTS ST GERMAIN TOILE FLAMMEE LIN ET SOIE",
> "itemDesc": "suit"
>   }
>
> Do you have any idea to change it?
>
> Thanks.
>
> Regards,
>
> Romain


Re: ExtractingRequestHandler indexing zip files

2014-09-09 Thread keeblerh
I am also having the issue where my zip contents (or kmz contents) are not
being processed - only the file names are processed.  It seems to recognize
the kmz extension and open the file, but it just doesn't recurse into
processing the contents.
The patch you mention has been around for a while.  I am running Solr 4.8.1
and it looks like the Tika jar is 1.5, so I would think the patch would be
included already.  Do I need additional configuration?  My config is as
follows:

[solrconfig.xml snippet stripped by the mail archive]

I am using the dataImport option from the admin page.  Thanks for any
assistance - I'm on a closed network and getting patches onto it is not
trivial.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/ExtractingRequestHandler-indexing-zip-files-tp4138172p4157650.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Creating Solr servers dynamically in Multicore folder

2014-09-09 Thread Erick Erickson
Well, you already have a core.properties file defined in that
location. I presume you're operating in "core discovery" mode. Your
cores would all be very confused if new cores were defined on top of
old cores.

It is a little clumsy at this point in that you have to have a conf
directory in place but _not_ a core.properties file to create a core
like this. Config sets will eventually fix this.

Best,
Erick
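
For what it's worth, a minimal sketch of that pattern (the core name and
paths are hypothetical): give each new core its own instanceDir containing a
conf/ directory with solrconfig.xml and schema.xml, but no core.properties,
and let the CoreAdmin CREATE call write that file.

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;
import org.apache.solr.client.solrj.response.CoreAdminResponse;

public class CreateCoreExample {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
    // /opt/solr/core/multicore/hellocore/conf must already contain
    // solrconfig.xml and schema.xml; do NOT pre-create core.properties there.
    CoreAdminResponse resp = CoreAdminRequest.createCore(
        "hellocore", "/opt/solr/core/multicore/hellocore", server);
    System.out.println("CREATE status: " + resp.getStatus());
  }
}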

On Mon, Sep 8, 2014 at 11:00 PM, nishwanth  wrote:
> Hello ,
>
> I am using Solr version 4.8.1 and I am trying to create the cores
> dynamically on server start-up using the following piece of code.
>
>  HttpSolrServer s = new HttpSolrServer( url );
> s.setParser(new BinaryResponseParser());
> s.setRequestWriter(new BinaryRequestWriter());
> SolrServer server = s;
> String instanceDir ="/opt/solr/core/multicore/";
> CoreAdminResponse e =  new CoreAdminRequest().createCore(name,
> instanceDir,
> server,"/opt/solr/core/multicore/solrconfig.xml","/opt/solr/core/multicore/schema.xml");
>
> I am getting the error
>
> org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error
> CREATEing SolrCore 'hellocore': Could not create a new core in
> /opt/solr/core/multicore/ as another core is already defined there
>         at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:554)
>         at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
>         at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
>         at org.apache.solr.client.solrj.request.CoreAdminRequest.process(CoreAdminRequest.java:503)
>         at org.apache.solr.client.solrj.request.CoreAdminRequest.createCore(CoreAdminRequest.java:580)
>         at org.apache.solr.client.solrj.request.CoreAdminRequest.createCore(CoreAdminRequest.java:560)
>         at app.services.OperativeAdminScheduler.scheduleTask(OperativeAdminScheduler.java:154)
>         at Global.onStart(Global.java:31)
>
> I am still getting the above error even though the core0 and core1 folders
> in multicore are deleted and the same is commented out in
> /opt/solr/core/multicore/solrconfig.xml. Also, I enabled persistent=true in
> the solrconfig.xml.
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Creating-Solr-servers-dynamically-in-Multicore-folder-tp4157550.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: [Announce] Apache Solr 4.10 with RankingAlgorithm 1.5.4 available now with complex-lsa algorithm (simulates human language acquisition and recognition)

2014-09-09 Thread Diego Fernandez
Interesting.  Does anyone know how that compares to this 
http://www.searchbox.com/products/searchbox-plugins/solr-sense/?

Diego Fernandez - 爱国
Software Engineer
US GSS Supportability - Diagnostics


- Original Message -
> Hi!
> 
> I am very excited to announce the availability of Apache Solr 4.10 with
> RankingAlgorithm 1.5.4.
> 
> Solr 4.10.0 with RankingAlgorithm 1.5.4 includes support for complex-lsa.
> complex-lsa simulates human language acquisition and recognition (see demo
>  ) and can retrieve
> semantically related/hidden relationships between terms, sentences,
> paragraphs, chapters, books, images, etc. Three new similarities,
> TERM_SIMILARITY, DOCUMENT_SIMILARITY, TERM_DOCUMENT_SIMILARITY enable these
> with improved precision.  A query for “holy AND ghost” returns jesus/christ
> as the top results for the bible corpus with no effort to introduce this
> relationship (see demo  ).
> 
>  
> 
> This version adds support for  multiple linear algebra libraries.
> complex-lsa does a large amount of these calculations, so speeding them up
> should speed up retrieval, etc. EJML is the fastest if you are using complex-lsa
> for a smaller set of documents, while MTJ is faster as your document
> collection becomes bigger. MTJ can also use BLAS/LAPACK, etc installed on
> your system to further improve performance with native execution. The
> performance is similar to a C/C++ application. It can also make use of GPUs
> or Intel's mkl library if you have access to it.
> 
> RankingAlgorithm 1.5.4 with complex-lsa supports the entire Lucene Query
> Syntax, ± and/or boolean/dismax/glob/regular
> expression/wildcard/fuzzy/prefix/suffix queries with boosting, etc. This
> version increases performance, with increased accuracy and relevance for
> Document similarity, fixes problems with phrase queries,  Boolean queries,
> etc.
> 
> 
> You can get more information about complex-lsa and realtime-search
> performance from here:
> http://solr-ra.tgels.org/wiki/en/Complex-lsa-demo
> 
> You can download Solr 4.10 with RankingAlgorithm 1.5.4 from here:
> http://solr-ra.tgels.org
> 
> Please download and give the new version a try.
> 
> Regards,
> 
> Nagendra Nagarajayya
> http://solr-ra.tgels.org
> http://elasticsearch-ra.tgels.org
> http://rankingalgorithm.tgels.org
> 
> Note:
> 1. Apache Solr 4.10 with RankingAlgorithm 1.5.4 is an external project.
> 
> 
> 
> 


Re: Solr Sharding Help

2014-09-09 Thread Ethan
Thanks Jeff.  I had a different idea of how replicationFactor worked.  I was
able to create the setup with that command.

Now, as I import data into the cluster, how can I determine that it's being
sharded?
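
(One way to check, as a hedged sketch -- the shard core URLs below are
assumptions based on the core names from the CREATE response further down:
ask each shard core for its local document count with distrib=false and
compare against the collection-wide total.)

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class ShardCounts {
  public static void main(String[] args) throws Exception {
    // Core URLs taken from the CREATE response; adjust hosts as needed.
    String[] shards = {
        "http://serv001:5258/solr/Main_shard1_replica1",
        "http://serv002:5258/solr/Main_shard2_replica1" };
    SolrQuery q = new SolrQuery("*:*");
    q.set("distrib", "false"); // count only the local shard's documents
    for (String url : shards) {
      long n = new HttpSolrServer(url).query(q).getResults().getNumFound();
      System.out.println(url + " -> " + n + " docs");
    }
  }
}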

On Mon, Sep 8, 2014 at 1:52 PM, Jeff Wartes  wrote:

>
> You need to specify a replication factor of 2 if you want two copies of
> each shard. Solr doesn't "auto fill" available capacity, contrary to the
> misleading examples on the http://wiki.apache.org/solr/SolrCloud page.
> Those examples only have that behavior because they ask you to copy the
> examples directory, which brings some on-disk configuration with it.
>
>
>
> On 9/8/14, 1:33 PM, "Ethan"  wrote:
>
> >Thanks Erick.  That cleared my confusion.
> >
> >I have a follow-up question - if I run the CREATE command with 4 nodes in
> >createNodeSet, I thought 2 leaders and 2 followers would be created
> >automatically. That's not the case, however.
> >
> >
> http://serv001:5258/solr/admin/collections?action=CREATE&name=Main&numShar
> >ds=2&maxShardsPerNode=1&createNodeSet=
> > serv001:5258_solr, serv002:5258_solr,serv003:5258_solr, serv004:5258_solr
> >
> >I still get the same response.  I see 2 leaders being created, but I do not
> >see the other 2 nodes show up as followers in the cloud page in the Solr
> >Admin UI.  It looks like the collection was not created for those 2 nodes
> >at all.
> >
> >Is there additional step involved to add them?
> >
> >On Mon, Sep 8, 2014 at 12:11 PM, Erick Erickson 
> >wrote:
> >
> >> Ahhh, this is a continual source of confusion. I've started a one-man
> >> campaign to talk about "leaders" and "followers" when relevant...
> >>
> >> _Every_ node is a "replica". This is because a node can be a leader or
> >> follower, and the role can change.
> >>
> >> So your case is entirely normal. These nodes are probably the leaders
> >> too, and will remain so while you add more replicas/followers.
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Sep 8, 2014 at 11:20 AM, Ethan  wrote:
> >> > I am trying to setup 2 shard cluster with 2 replicas with dedicated
> >>nodes
> >> > for replicas.  I have 4 node SolrCloud setup that I am trying to shard
> >> > using collections api .. (Like
> >> >
> >>
> >>
> https://wiki.apache.org/solr/SolrCloud#Example_C:_Two_shard_cluster_with_
> >>shard_replicas_and_zookeeper_ensemble
> >> > )
> >> >
> >> > I ran this command -
> >> >
> >> >
> >>
> >>
> http://serv001:5258/solr/admin/collections?action=CREATE&name=Main&numSha
> >>rds=2&maxShardsPerNode=1&createNodeSet=
> >> >  serv001:5258_solr, serv002:5258_solr
> >> >
> >> > Response -
> >> >
> >> > <response>
> >> >   <lst name="responseHeader">
> >> >     <int name="status">0</int>
> >> >     <int name="QTime">3932</int>
> >> >   </lst>
> >> >   <lst name="success">
> >> >     <lst>
> >> >       <lst name="responseHeader">
> >> >         <int name="status">0</int>
> >> >         <int name="QTime">2982</int>
> >> >       </lst>
> >> >       <str name="core">Main_shard2_replica1</str>
> >> >     </lst>
> >> >     <lst>
> >> >       <lst name="responseHeader">
> >> >         <int name="status">0</int>
> >> >         <int name="QTime">3005</int>
> >> >       </lst>
> >> >       <str name="core">Main_shard1_replica1</str>
> >> >     </lst>
> >> >   </lst>
> >> > </response>
> >> > I want to know what *_replica1 or *_replica2 means?  Are they actually
> >> > replicas and not the shards?  I intended to add 2 more nodes as
> >>dedicated
> >> > replication nodes.  How to accomplish that?
> >> >
> >> > Would appreciate any pointers.
> >> >
> >> > -E
> >>
>
>


Re: Using a RequestHandler to expand query parameter

2014-09-09 Thread jimtronic
So, the problem I found that's driving this is that I have several phrase
synonyms set up, for example "ipod mini" to "ipad mini". Such a synonym is
only applied if you submit it as a phrase in quotes.

So the pf param doesn't help, because it's not the right phrase in the first
place.

I can fix this by sending in the query as ("ipod mini" ipod mini).
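
(A tiny illustration of that client-side rewrite, purely a sketch with a
made-up class name: wrap the raw user input as both a quoted phrase, so
multi-word synonyms fire, and as bare terms.)

public class PhraseWrap {
  // Rewrites e.g. ipod mini -> ("ipod mini" ipod mini)
  static String wrap(String userQuery) {
    return "(\"" + userQuery + "\" " + userQuery + ")";
  }

  public static void main(String[] args) {
    System.out.println(wrap("ipod mini")); // ("ipod mini" ipod mini)
  }
}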





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-a-RequestHandler-to-expand-query-parameter-tp4155596p4157637.html
Sent from the Solr - User mailing list archive at Nabble.com.


[Announce] Apache Solr 4.10 with RankingAlgorithm 1.5.4 available now with complex-lsa algorithm (simulates human language acquisition and recognition)

2014-09-09 Thread nnagarajayya
Hi!

I am very excited to announce the availability of Apache Solr 4.10 with
RankingAlgorithm 1.5.4. 

Solr 4.10.0 with RankingAlgorithm 1.5.4 includes support for complex-lsa.
complex-lsa simulates human language acquisition and recognition (see demo
 ) and can retrieve
semantically related/hidden relationships between terms, sentences,
paragraphs, chapters, books, images, etc. Three new similarities,
TERM_SIMILARITY, DOCUMENT_SIMILARITY, TERM_DOCUMENT_SIMILARITY enable these
with improved precision.  A query for “holy AND ghost” returns jesus/christ
as the top results for the bible corpus with no effort to introduce this
relationship (see demo  ).

 

This version adds support for  multiple linear algebra libraries.
complex-lsa does a large amount of these calculations, so speeding them up
should speed up retrieval, etc. EJML is the fastest if you are using complex-lsa
for a smaller set of documents, while MTJ is faster as your document
collection becomes bigger. MTJ can also use BLAS/LAPACK, etc installed on
your system to further improve performance with native execution. The
performance is similar to a C/C++ application. It can also make use of GPUs
or Intel's mkl library if you have access to it.

RankingAlgorithm 1.5.4 with complex-lsa supports the entire Lucene Query
Syntax, ± and/or boolean/dismax/glob/regular
expression/wildcard/fuzzy/prefix/suffix queries with boosting, etc. This
version increases performance, with increased accuracy and relevance for
Document similarity, fixes problems with phrase queries,  Boolean queries,
etc.


You can get more information about complex-lsa and realtime-search
performance from here: 
http://solr-ra.tgels.org/wiki/en/Complex-lsa-demo

You can download Solr 4.10 with RankingAlgorithm 1.5.4 from here: 
http://solr-ra.tgels.org

Please download and give the new version a try.

Regards, 

Nagendra Nagarajayya 
http://solr-ra.tgels.org 
http://elasticsearch-ra.tgels.org 
http://rankingalgorithm.tgels.org 

Note: 
1. Apache Solr 4.10 with RankingAlgorithm 1.5.4 is an external project. 





Re: Master - Master / Upgrading a slave to master

2014-09-09 Thread Salman Akram
So, realistically speaking, you cannot have SolrCloud work across 2 data
centers as a redundant solution, because no matter how many nodes you add you
would still need at least 1 node in the 2nd center working too.

So that just leaves the non-SolrCloud solutions.

"1) Change the replication config to redefine the master and reload the core
or restart Solr."

That of course is a simple way, but the real issue is the possible pitfalls
and good practices. For example, the normal scenario would be that the
primary data center goes down for a few hours, and in the meantime we upgrade
one of the slaves in the secondary to a master. Now:

- If there is no lag, there won't be any issue in the secondary at least, but
what if there is lag and one of the files is not completely replicated? Would
that file be discarded, or is there a possibility that the whole index is
unusable?

- Once the primary comes back, how would we copy the delta from the
secondary? Make it a slave of the secondary first, replicate the delta, and
then set it as a master again?

In other words, is there a good guide out there for this, with possible
issues and solutions? People were definitely doing this before SolrCloud, and
even now SolrCloud doesn't seem practical in quite a few situations.

Thanks again!!

On Tue, Sep 9, 2014 at 8:02 PM, Shawn Heisey  wrote:

> On 9/9/2014 8:46 AM, Salman Akram wrote:
> > You mean 3 'data centers' or 'nodes'? I am thinking: if we have 2 nodes
> > in the primary and 1 in the secondary, and we normally keep the secondary
> > down, would that work? Basically, the secondary network is just for
> > redundancy and won't be as fast, so normally we won't want to shift
> > traffic there.
> >
> > So can we have nodes just for redundancy and NOT load balancing, i.e. it
> > has 3 nodes but updates go to only one of them? Similarly, for the slave
> > replicas, can we limit searches to a certain slave, or will it be
> > auto-balanced?
> >
> > Also, apart from SolrCloud, is it possible to have multiple masters in
> > Solr, or is there a good guide to upgrading a slave to master?
>
> You must have three zookeeper nodes for a redundant setup.  If you only
> have two data centers, then you must put at least two of those nodes in
> one data center.  If the data center with two zookeeper nodes goes down,
> zookeeper cannot function, which means SolrCloud will not work
> correctly.  There is no way to maintain SolrCloud redundancy with only
> two data centers.  You might think to add a fourth ZK node and split
> them between the data centers ... except that in that situation, at
> least three nodes must be functional.  Two out of four nodes is not enough.
>
> A minimal fault-tolerant SolrCloud install is three physical machines.
> Two of them run ZK and Solr, one of them runs ZK only.
>
> If you don't use SolrCloud, then you have two choices to switch masters:
>
> 1) Change the replication config to redefine the master and reload the
> core or restart Solr.
> 2) Write scripts that manually use the replication HTTP API to do all
> your replication, rather than let Solr handle it automatically.  You can
> choose the master for every replication with HTTP calls.
>
> https://wiki.apache.org/solr/SolrReplication#HTTP_API
>
> Thanks,
> Shawn
>
>


-- 
Regards,

Salman Akram


Re: Master - Master / Upgrading a slave to master

2014-09-09 Thread Shawn Heisey
On 9/9/2014 8:46 AM, Salman Akram wrote:
> You mean 3 'data centers' or 'nodes'? I am thinking: if we have 2 nodes in
> the primary and 1 in the secondary, and we normally keep the secondary down,
> would that work? Basically, the secondary network is just for redundancy and
> won't be as fast, so normally we won't want to shift traffic there.
>
> So can we have nodes just for redundancy and NOT load balancing, i.e. it has
> 3 nodes but updates go to only one of them? Similarly, for the slave replicas,
> can we limit searches to a certain slave, or will it be auto-balanced?
>
> Also, apart from SolrCloud, is it possible to have multiple masters in Solr,
> or is there a good guide to upgrading a slave to master?

You must have three zookeeper nodes for a redundant setup.  If you only
have two data centers, then you must put at least two of those nodes in
one data center.  If the data center with two zookeeper nodes goes down,
zookeeper cannot function, which means SolrCloud will not work
correctly.  There is no way to maintain SolrCloud redundancy with only
two data centers.  You might think to add a fourth ZK node and split
them between the data centers ... except that in that situation, at
least three nodes must be functional.  Two out of four nodes is not enough.

A minimal fault-tolerant SolrCloud install is three physical machines. 
Two of them run ZK and Solr, one of them runs ZK only.

If you don't use SolrCloud, then you have two choices to switch masters:

1) Change the replication config to redefine the master and reload the
core or restart Solr.
2) Write scripts that manually use the replication HTTP API to do all
your replication, rather than let Solr handle it automatically.  You can
choose the master for every replication with HTTP calls.

https://wiki.apache.org/solr/SolrReplication#HTTP_API
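
As a hedged sketch of option 2 (host and core names are hypothetical;
command=fetchindex and the masterUrl override come from the wiki page above),
a script-driven replication pull could look like:

import java.io.InputStream;
import java.net.URL;
import java.net.URLEncoder;

public class ManualReplication {
  public static void main(String[] args) throws Exception {
    // Tell the slave to pull from an explicitly chosen master, right now.
    String masterUrl = URLEncoder.encode(
        "http://master-host:8983/solr/collection1/replication", "UTF-8");
    URL fetch = new URL("http://slave-host:8983/solr/collection1/replication"
        + "?command=fetchindex&masterUrl=" + masterUrl);
    try (InputStream in = fetch.openStream()) {
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) > 0) {
        System.out.write(buf, 0, n); // small XML status response
      }
    }
  }
}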

Thanks,
Shawn



Re: Using a RequestHandler to expand query parameter

2014-09-09 Thread Shawn Heisey
On 8/28/2014 7:43 AM, jimtronic wrote:
> I would like to send only one query to my custom request handler and have the
> request handler expand that query into a more complicated query.
>
> Example:
>
> */myHandler?q=kids+books*
>
> ... would turn into a more complicated EDismax query of:
>
> *"kids books" kids books*
>
> Is this achievable via a Request Handler definition in solrconfig.xml?

As someone else already said, you can write a custom request handler and
reference it in a handler definition in your solrconfig.xml file.  The
sky's the limit for that -- if you can write the code, Solr will use it.

This *specific* example that you've given is something that the edismax
parser will give you out of the box, when you define the qf and pf
parameters.  It will automatically search the individual terms you give
on the fields in the qf parameter, *and* do a phrase search for all
those terms on the fields in the pf parameter.
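
For example, via SolrJ (a hedged sketch; the field names and URL are
hypothetical), the query from the original post could be sent as:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class EdismaxExample {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server =
        new HttpSolrServer("http://localhost:8983/solr/collection1");
    SolrQuery q = new SolrQuery("kids books");
    q.set("defType", "edismax");
    q.set("qf", "title^2 description");    // per-term matching
    q.set("pf", "title^10 description^2"); // automatic phrase boost
    System.out.println(server.query(q));
  }
}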

https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser
http://wiki.apache.org/solr/ExtendedDisMax

Thanks,
Shawn



Re: Master - Master / Upgrading a slave to master

2014-09-09 Thread Salman Akram
You mean 3 'data centers' or 'nodes'? I am thinking: if we have 2 nodes in
the primary and 1 in the secondary, and we normally keep the secondary down,
would that work? Basically, the secondary network is just for redundancy and
won't be as fast, so normally we won't want to shift traffic there.

So can we have nodes just for redundancy and NOT load balancing, i.e. it has
3 nodes but updates go to only one of them? Similarly, for the slave replicas,
can we limit searches to a certain slave, or will it be auto-balanced?

Also, apart from SolrCloud, is it possible to have multiple masters in Solr,
or is there a good guide to upgrading a slave to master?

Thanks

On Tue, Sep 9, 2014 at 5:40 PM, Shawn Heisey  wrote:

> On 9/8/2014 9:54 PM, Salman Akram wrote:
> > We have a redundant data center in case the primary goes down. Currently
> we
> > have 1 master and multiple slaves on primary data center. This master
> also
> > replicates to a slave in secondary data center. So if the primary goes
> down
> > at least the read only part works. However, now we want writes to work on
> > secondary data center too when primary goes down.
> >
> > - Is it possible in SOLR to have Master - Master?
> > - If not then what's the best strategy to upgrade a slave to master?
> > - Naturally there would be some latency due to data centers being in
> > different geographical locations so what are the normal data issues and
> > best practices in case primary goes down? We would also like to shift
> back
> > to primary as soon as its back.
>
> SolrCloud would work, but only if you have *three* datacenters.  Two of
> them would need to remain fully operational.  SolrCloud is a true
> cluster -- there is no master.  Each of the shards in a collection has
> one or more replicas.  One of the replicas gets elected to be leader,
> but the leader designation can change.
>
> The reason that you need three is because of zookeeper, which is the
> software that actually maintains the cluster and handles leader
> elections.  A majority of zookeeper nodes (more than half of them) must
> be operational for zookeeper to maintain quorum.  That means that the
> minimum number of zookeepers is three, and in a three-node system, one
> can go down without disrupting operation.
>
> One thing that SolrCloud doesn't yet have is rack/datacenter awareness.
>  Requests get load balanced across the entire cluster, regardless of
> where they are located.  It's something that will eventually come, but I
> don't have any kind of estimate for when.
>
> Thanks,
> Shawn
>
>


-- 
Regards,

Salman Akram


Re: Using a RequestHandler to expand query parameter

2014-09-09 Thread jmlucjav
This is easily doable with a custom (Java code) request handler. If you want
to avoid writing any Java code, you should investigate
https://issues.apache.org/jira/browse/SOLR-5005 (I am myself going to have
a look at this interesting feature).

On Tue, Sep 9, 2014 at 4:33 PM, jimtronic  wrote:

> Never got a response on this ... Just looking for the best way to handle
> it?
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Using-a-RequestHandler-to-expand-query-parameter-tp4155596p4157613.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Using a RequestHandler to expand query parameter

2014-09-09 Thread jimtronic
Never got a response on this ... Just looking for the best way to handle it?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-a-RequestHandler-to-expand-query-parameter-tp4155596p4157613.html
Sent from the Solr - User mailing list archive at Nabble.com.


Send nested doc with solrJ

2014-09-09 Thread Ali Nazemian
Dear all,
Hi,
I was wondering how I can use SolrJ to send nested documents to Solr.
Unfortunately, I did not find any tutorial for this purpose. I would really
appreciate it if you could guide me through that. Thank you very much.
Best regards.

-- 
A.Nazemian
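
(For what it's worth, a minimal hedged sketch -- field names and URL are
hypothetical: SolrJ 4.5+ supports nesting via
SolrInputDocument.addChildDocument, and the nested structure is then
queryable with the {!parent}/{!child} block-join parsers.)

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class NestedDocExample {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server =
        new HttpSolrServer("http://localhost:8983/solr/collection1");

    SolrInputDocument parent = new SolrInputDocument();
    parent.addField("id", "book-1");
    parent.addField("type_s", "book");

    SolrInputDocument child = new SolrInputDocument();
    child.addField("id", "chapter-1");
    child.addField("type_s", "chapter");
    parent.addChildDocument(child); // nests the chapter under the book

    server.add(parent);
    server.commit();
  }
}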


Re: Chronological partitioning of data - what does Solr offer in this area?

2014-09-09 Thread Alexandre Rafalovitch
Have you looked at collection aliasing already?
http://www.anshumgupta.net/2013/10/collection-aliasing-in-solrcloud.html
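
As a rough sketch of that pattern (collection and alias names are made up;
CREATE and CREATEALIAS are standard Collections API actions): create one
collection per period, keep a write alias on the current one and a read alias
spanning the recent ones. Re-pointing an alias is cheap, which is what makes
the daily rollover practical.

import java.net.URL;

public class DailyAlias {
  // Fire a Collections API call; a real script would parse the XML status.
  static void call(String params) throws Exception {
    new URL("http://localhost:8983/solr/admin/collections?" + params)
        .openStream().close();
  }

  public static void main(String[] args) throws Exception {
    // New collection for today...
    call("action=CREATE&name=logs_2014_09_10&numShards=2");
    // ...searches span the recent days, indexing targets today only.
    call("action=CREATEALIAS&name=logs_read"
        + "&collections=logs_2014_09_10,logs_2014_09_09");
    call("action=CREATEALIAS&name=logs_write&collections=logs_2014_09_10");
  }
}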

Regards,
   Alex
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On Tue, Sep 9, 2014 at 9:23 AM, Gili Nachum  wrote:
> Hello!
>
> *Does Solr support any sort of chronological ordering of data?*
> I would like to divide my data to: Daily, weekly, monthly, yearly parts.
> For performance sake.
> Has anyone done something like this over SolrCloud?
>
> More thoughts:
> While Indexing: I'm soft committing every 2 seconds so I would rather do
> that on the daily index only (open reader and cache invalidation effort) as
> the total index shard size is >200GB
> While Searching: I would rather search first the daily and weekly parts,
> moving to older data only if results are not satisfying.
>
> I guess there's a challenge in moving the daily data to the weekly data at
> the end of a day, and so on.
> Is anything built-in that goes in this direction? If not, is there any example
> of some custom collection/sharding configuration?
>
> Thanks.
> Gili.


Sorting docs by Hamming distance

2014-09-09 Thread michael.boom
Hi,

Did anybody try to embed into Solr sorting based on the Hamming distance of
a certain field? http://en.wikipedia.org/wiki/Hamming_distance
E.g. having a document doc1 with a field doc_hash:"12345678" and doc2 with
doc_hash:"12345699", when searching for doc_hash:"12345678" the sort order
should be -> doc1, doc2.

What would be the best way to achieve this kind of behaviour? Writing a
plugin or maybe a custom function query?

Thanks!
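
(Whichever route -- plugin or custom function query -- the per-document
computation is small. A minimal sketch of it in plain Java, with made-up
names; a custom ValueSource would evaluate this against the query hash and
sort ascending on the result:)

public class Hamming {
  // Distance between two equal-length hashes: the number of differing positions.
  static int distance(String a, String b) {
    if (a.length() != b.length()) {
      throw new IllegalArgumentException("inputs must have equal length");
    }
    int d = 0;
    for (int i = 0; i < a.length(); i++) {
      if (a.charAt(i) != b.charAt(i)) d++;
    }
    return d;
  }

  public static void main(String[] args) {
    System.out.println(distance("12345678", "12345678")); // 0 -> doc1 sorts first
    System.out.println(distance("12345678", "12345699")); // 2 -> doc2 second
  }
}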




-
Thanks,
Michael
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sorting-docs-by-Hamming-distance-tp4157600.html
Sent from the Solr - User mailing list archive at Nabble.com.


Chronological partitioning of data - what does Solr offer in this area?

2014-09-09 Thread Gili Nachum
Hello!

*Does Solr support any sort of chronological ordering of data?*
I would like to divide my data to: Daily, weekly, monthly, yearly parts.
For performance sake.
Has anyone done something like this over SolrCloud?

More thoughts:
While Indexing: I'm soft committing every 2 seconds so I would rather do
that on the daily index only (open reader and cache invalidation effort) as
the total index shard size is >200GB
While Searching: I would rather search first the daily and weekly parts,
moving to older data only if results are not satisfying.

I guess there's a challenge in moving the daily data to the weekly data at
the end of a day, and so on.
Is anything built-in that goes in this direction? If not, is there any example
of some custom collection/sharding configuration?

Thanks.
Gili.


Re: Solr WARN Log

2014-09-09 Thread Shawn Heisey
On 9/9/2014 2:56 AM, Joseph V J wrote:
> I'm trying to upgrade Solr from version 4.2 to 4.9, since then I'm
> receiving the following warning from solr log. It would be great if anyone
> could throw some light into it.
> 
> Level Logger Message
> WARN ManagedResource *No registered observers for /rest/managed*
> 
> OS Used : Debian GNU/Linux 7

This message comes from the new Schema REST API.  Basically it means you
haven't configured it.  You can ignore this message.  To get it to go
away, you would need to configure the new feature.

https://cwiki.apache.org/confluence/display/solr/Schema+API

Thanks,
Shawn



Re: Master - Master / Upgrading a slave to master

2014-09-09 Thread Shawn Heisey
On 9/8/2014 9:54 PM, Salman Akram wrote:
> We have a redundant data center in case the primary goes down. Currently we
> have 1 master and multiple slaves on primary data center. This master also
> replicates to a slave in secondary data center. So if the primary goes down
> at least the read only part works. However, now we want writes to work on
> secondary data center too when primary goes down.
> 
> - Is it possible in SOLR to have Master - Master?
> - If not then what's the best strategy to upgrade a slave to master?
> - Naturally there would be some latency due to data centers being in
> different geographical locations so what are the normal data issues and
> best practices in case primary goes down? We would also like to shift back
> to primary as soon as its back.

SolrCloud would work, but only if you have *three* datacenters.  Two of
them would need to remain fully operational.  SolrCloud is a true
cluster -- there is no master.  Each of the shards in a collection has
one or more replicas.  One of the replicas gets elected to be leader,
but the leader designation can change.

The reason that you need three is because of zookeeper, which is the
software that actually maintains the cluster and handles leader
elections.  A majority of zookeeper nodes (more than half of them) must
be operational for zookeeper to maintain quorum.  That means that the
minimum number of zookeepers is three, and in a three-node system, one
can go down without disrupting operation.

One thing that SolrCloud doesn't yet have is rack/datacenter awareness.
 Requests get load balanced across the entire cluster, regardless of
where they are located.  It's something that will eventually come, but I
don't have any kind of estimate for when.

Thanks,
Shawn



Using def function in fl criteria,

2014-09-09 Thread Pigeyre Romain
Hi

I'm trying to use a query with 
fl=name_UK,name_FRA,itemDesc:def(name_UK,name_FRA)
As you can see, the itemDesc field (builded by solr) is truncated :

{
"name_UK": "MEN S SUIT\n",
"name_FRA": "24 RELAX 2 BTS ST GERMAIN TOILE FLAMMEE LIN ET SOIE",
"itemDesc": "suit"
  }

Do you have any idea to change it?

Thanks.

Regards,

Romain


Re: Language detection for multivalued field

2014-09-09 Thread lsanchez
Hi all,
I don't know if this can help somebody, but I've changed the process method
of the LanguageIdentifierUpdateProcessor class to support multivalued
fields, and it works pretty well:


protected SolrInputDocument process(SolrInputDocument doc) {
    String docLang = null;
    HashSet<String> docLangs = new HashSet<String>();
    String fallbackLang = getFallbackLang(doc, fallbackFields, fallbackValue);

    if (langField == null || !doc.containsKey(langField) ||
        (doc.containsKey(langField) && overwrite)) {
      String allText = concatFields(doc, inputFields);
      List<DetectedLanguage> languagelist = detectLanguage(allText);
      docLang = resolveLanguage(languagelist, fallbackLang);
      docLangs.add(docLang);
      log.debug("Detected main document language from fields " +
          inputFields.toString() + ": " + docLang);

      if (doc.containsKey(langField) && overwrite) {
        log.debug("Overwritten old value " + doc.getFieldValue(langField));
      }
      if (langField != null && langField.length() != 0) {
        doc.setField(langField, docLang);
      }
    } else {
      // langField is set, we sanity check it against whitelist and fallback
      docLang = resolveLanguage((String) doc.getFieldValue(langField), fallbackLang);
      docLangs.add(docLang);
      log.debug("Field " + langField + " already contained value " + docLang +
          ", not overwriting.");
    }

    if (enableMapping) {
      for (String fieldName : allMapFieldsSet) {
        if (doc.containsKey(fieldName)) {
          String fieldLang = "";
          if (mapIndividual && mapIndividualFieldsSet.contains(fieldName)) {
            // Multivalued case: detect the language of each value individually
            Collection<Object> c = doc.getFieldValues(fieldName);
            for (Object o : c) {
              if (o instanceof String) {
                List<DetectedLanguage> languagelist = detectLanguage((String) o);
                fieldLang = resolveLanguage(languagelist, docLang);
                docLangs.add(fieldLang);
                log.debug("Mapping multivalued field " + fieldName +
                    " using individually detected language " + fieldLang);
                String mappedOutputField = getMappedField(fieldName, fieldLang);
                if (mappedOutputField != null) {
                  log.debug("Mapping multivalued field {} to {}",
                      doc.getFieldValue(docIdField), fieldLang);
                  SolrInputField inField = new SolrInputField(fieldName);
                  Collection<Object> currentContent =
                      doc.getFieldValues(mappedOutputField);
                  if (currentContent != null && currentContent.size() > 0) {
                    // The mapped field already holds values: append this one
                    doc.addField(mappedOutputField, o);
                  } else {
                    inField.setValue(o, doc.getField(fieldName).getBoost());
                    doc.setField(mappedOutputField, inField.getValue(),
                        inField.getBoost());
                  }
                  if (!mapKeepOrig) {
                    log.debug("Removing old field {}", fieldName);
                    doc.removeField(fieldName);
                  }
                } else {
                  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
                      "Invalid output field mapping for " + fieldName +
                      " field and language: " + fieldLang);
                }
              }
            }
          } else {
            // Single mapping: use the document's global language for the field
            fieldLang = docLang;
            log.debug("Mapping field " + fieldName +
                " using document global language " + fieldLang);
            String mappedOutputField = getMappedField(fieldName, fieldLang);

            if (mappedOutputField != null) {
              log.debug("Mapping field {} to {}",
                  doc.getFieldValue(docIdField), fieldLang);
              SolrInputField inField = doc.getField(fieldName);
              doc.setField(mappedOutputField, inField.getValue(),
                  inField.getBoost());
              if (!mapKeepOrig) {
                log.debug("Removing old field {}", fieldName);
                doc.removeField(fieldName);
              }
            } else {
              throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
                  "Invalid output field mapping for " + fieldName +
                  " field and language: " + fieldLang);
            }
          }
        }
      }
    }

    // Set the languages field to an array of all detected languages
    if (langsField != null && langsField.length() != 0) {
      doc.setField(langsField, docLangs.toArray());
    }

    return doc;
  }



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Language-detection-for-multivalued-field-tp4096996p4157573.html
Sent from the Solr - User mailing list archive at Nabble.com.


OpenNLP integration with Solr

2014-09-09 Thread Ankur Dulwani
I am using Solr 4.9 and want to integrate OpenNLP with it. I applied the
LUCENE-2899 patch successfully, and the following are my changes in
schema.xml:

[schema XML stripped by the mail archive]

But no proper outcomes can be seen. It is not recognizing named entities like
person, organization, etc.; instead it puts all the text in the person field.
What am I doing wrong? Please help.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/OpenNLP-integration-with-Solr-tp4157569.html
Sent from the Solr - User mailing list archive at Nabble.com.

Solr WARN Log

2014-09-09 Thread Joseph V J
Hi,

I'm trying to upgrade Solr from version 4.2 to 4.9, since then I'm
receiving the following warning from solr log. It would be great if anyone
could throw some light into it.

Level Logger Message
WARN ManagedResource *No registered observers for /rest/managed*

OS Used : Debian GNU/Linux 7

~Thanks
Joe


Re: SOLR tuning

2014-09-09 Thread J'roo
BTW - there are no HTTP timeouts active, so this part is OK



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-tuning-tp4157561p4157562.html
Sent from the Solr - User mailing list archive at Nabble.com.


SOLR tuning

2014-09-09 Thread van...@bluewin.ch
Hi,

Newbie testing on a laptop here. I've got two cores (shards) in my datastore.
When searching, I get this error if the result is above approx. 200'000
records; below 200'000 it returns fine. I thought it was simply a case of
upping the Java heap size, but no luck. I do not want to use start/cursorMark
in this case. Does anyone know what to tune exactly in order to get more than
200'000 records back? Running in JBoss.

Thanks!

09:25:53,812 ERROR [SolrCore] java.lang.NullPointerException
        at java.io.StringReader.<init>(StringReader.java:33)
        at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:203)
        at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:80)
        at org.apache.solr.search.QParser.getQuery(QParser.java:142)
        at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:101)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
        at com.incentage.ipc.index.InitializerDispatchFilter.doFilter(InitializerDispatchFilter.java:94)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:235)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:190)
        at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:92)
        at org.jboss.web.tomcat.security.SecurityContextEstablishmentValve.process(SecurityContextEstablishmentValve.java:126)
        at org.jboss.web.tomcat.security.SecurityContextEstablishmentValve.invoke(SecurityContextEstablishmentValve.java:70)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:158)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:330)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:829)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:598)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)

09:25:53,812 INFO  [SolrCore] [index.part.201409] webapp=/index path=/select params={} status=500 QTime=0
09:25:53,812 ERROR [SolrDispatchFilter] java.lang.NullPointerException
        at java.io.StringReader.<init>(StringReader.java:33)
        at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:203)
        at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:80)
        at org.apache.solr.search.QParser.getQuery(QParser.java:142)
        at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:101)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
        at com.incentage.ipc.index.InitializerDispatchFilter.doFilter(InitializerDispatchFilter.java:94)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at