Re: Solr 4.1/4.2 - SolrException: Error opening new searcher - With JUnit test class

2013-03-14 Thread mark12345
I wrote a simple test that reproduces a stack trace very similar to the one in
the above issue; the only differences are some line numbers.

Any ideas as to why the following happens?  Any help would be very
appreciated.




* The test case:

> @Test
> public void documentCommitAndRollbackTest() throws Exception {
> 
> // Reproduces:  SolrException: Error opening new searcher
> 
> server.rollback();
> server.commit();
> }
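
For reference, the server field used above is assumed to be a plain SolrJ
HttpSolrServer created in the test setup — a minimal sketch, where the URL and
core name are illustrative:

> @Before
> public void setUp() {
>     // Illustrative: point at the core under test.
>     server = new HttpSolrServer("http://localhost:8080/solr/collection1");
> }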


* The stack trace (which is repeated twice):

> Mar 15, 2013 3:48:09 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1415)
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1527)
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1304)
> at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:570)
> at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
> at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
> at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1055)
> at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
> at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
> at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
> at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
> at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
> at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
> at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:579)
> at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:307)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:722)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
> at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:583)
> at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:597)
> at org.apache.lucene.index.IndexWriter.nrtIsCurrent(IndexWriter.java:4143)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:266)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:245)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:235)
> at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:169)
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1360)
> ... 29 more
> 



* The test class:

> package org.apache.lucene.solr;
> 
> import java.io.Serializable;
> import java.util.Date;
> import java.util.List;
> import java.util.Locale;
> import java.util.UUID;
> 
> import junit.framework.Assert;
> 
> import org.apache.solr.client.solrj.SolrQuery;
> import org.apache.solr.client.solrj.beans.Field;
> import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
> import org.apache.solr.client.solrj.impl.HttpSolrServer;
> import org.apache.solr.client.solrj.response.QueryResponse;
> import org.apache.solr.client.solrj.response.

Re: Solr Replication

2013-03-14 Thread vicky desai
Hi,

I have a multi-core setup and there are continuous updates going on in each
core. Hence I would prefer not to take a backup, as it would either cause
downtime or, if there is write activity during the backup, the backup would be
corrupted. Can you please suggest a cleaner way to handle this?





SOLR Num Docs vs NumFound

2013-03-14 Thread Nathan Findley
On my Solr 4 setup, a *:* query returns a higher "numFound" value than the
"Num Docs" value reported on the statistics page of collection1. Why is that?
My data is split across 3 data import handlers, where each handler has the
same type of data but the ids are guaranteed to be different.


Are some of my documents not hard committed? If so, how do I hard commit?
Otherwise, why are these numbers different?


--
CTO
Zenlok株式会社



Re: Can we manipulate termfreq to count as 1 for multiple matches?

2013-03-14 Thread Felipe Lahti
Hi!

Take a look at the "omitTermFreqAndPositions" parameter:
http://wiki.apache.org/solr/SchemaXml#Common_field_options

Or you can use a custom similarity class that overrides the term frequency and
returns one for that field only:
http://wiki.apache.org/solr/SchemaXml#Similarity
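
A minimal sketch of that similarity approach, assuming Solr 4.x / Lucene 4.x
(the class name is illustrative; note that in 4.x a per-field <similarity> in
schema.xml also requires a SchemaSimilarityFactory-style global similarity):

import org.apache.lucene.search.similarities.DefaultSimilarity;

// Caps the term frequency: any tf > 0 scores as if the term occurred once.
public class SingleOccurrenceSimilarity extends DefaultSimilarity {
    @Override
    public float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
}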


Best,

On Wed, Mar 13, 2013 at 8:43 PM, roz dev  wrote:

> Hi All
>
> I am wondering if there is a way to count the term frequency of a certain field
> as 1, even if there are multiple matches in that document?
>
> Use Case is:
>
> Let's say that I have a document with 2 fields
>
> - Name and
> - Description
>
> And, there is a document with data like this
>
> Document_1
> Name = Blue Jeans
> Description = This jeans is very soft.  Jeans is pretty nice.
>
> Now, If I Search for "Jeans" then "Jeans" is found in 2 places in
> Description field.
>
> Term Frequency for Description is 2
>
> I want Solr to count term frequency for Description as 1 even if "Jeans" is
> found multiple times in this field.
>
> For all other fields, I do want to get the term frequency as it is.
>
> Is this doable in Solr with any of the functions?
>
> Any inputs are welcome.
>
> Thanks
> Saroj
>



-- 
Felipe Lahti
Consultant Developer - ThoughtWorks Porto Alegre


Re: discovery-based core enumeration with embedded solr

2013-03-14 Thread Erick Erickson
Hmmm, could you raise a JIRA and assign it to me? Please be sure and
emphasize that it's embedded because I'm pretty sure this is fine for the
regular case.

But I have to admit that the embedded case completely slipped under the
radar.

Even better if you could make a test case, but that might not be
straightforward...

Thanks,
Erick


On Wed, Mar 13, 2013 at 5:28 PM, Michael Sokolov <
msoko...@safaribooksonline.com> wrote:

> Has the new core enumeration strategy been implemented in the
> CoreContainer.Initializer.initialize() code path?  It doesn't seem like
> it has.
>
> I get this exception:
>
> Caused by: org.apache.solr.common.SolrException: Could not load config
> for solrconfig.xml
> at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:991)
> at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1051)
> ... 10 more
> Caused by: java.io.IOException: Can't find resource 'solrconfig.xml' in
> classpath or 'solr-multi/collection1/conf/', cwd=/proj/lux
> at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:318)
> at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:283)
> at org.apache.solr.core.Config.<init>(Config.java:103)
> at org.apache.solr.core.Config.<init>(Config.java:73)
> at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:117)
> at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:989)
> ... 11 more
>
> even though I have a solr.properties file in solr-multi (which is my
> solr.home), and core.properties in some subdirectories of that
>
> --
> Michael Sokolov
> Senior Architect
> Safari Books Online
>
>


Re: Embedded Solr

2013-03-14 Thread rulinma
Here is some code you can use to test embedded Solr:

import java.io.File;
import java.io.IOException;
import java.net.MalformedURLException;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;

public class EmbededSolrTest {

    private static int commitNum = 5000;

    private static String path = "/home/solr/Rollin/solr-4.1.0/embeddedExample";

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        if (args != null) {
            if (args.length > 0) {
                path = args[0].trim();
            }
            if (args.length > 1) {
                commitNum = Integer.parseInt(args[1].trim());
            }
        }
        //path = "D:\\program\\solr\\41embededtest";
        System.setProperty("solr.solr.home", path);
        CoreContainer.Initializer initializer = new CoreContainer.Initializer();
        CoreContainer coreContainer = initializer.initialize();
        EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "");
        addIndex(server);
        //query(server);
        //deleteAllDoc(server);
    }

    public static void query(SolrServer server) throws Exception {
        try {
            SolrQuery q = new SolrQuery();
            q.setQuery("*:*");
            q.setStart(0);
            q.setRows(20);
            SolrDocumentList list = server.query(q).getResults();
            System.out.println(list.getNumFound());
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            server.shutdown();
        }
    }

    public static void deleteAllDoc(SolrServer server) throws Exception {
        try {
            server.deleteByQuery("*:*");
            server.commit();
            query(server);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            server.shutdown();
        }
    }

    public static void addIndex(SolrServer solrServer) throws IOException,
            ParseException {

        String path = "index";
        Analyzer analyzer = new SimpleAnalyzer(Version.LUCENE_35);
        //Analyzer analyzer = new SimpleAnalyzer();
        Directory directory = FSDirectory.open(new File(path));
        IndexReader ireader = IndexReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);
        QueryParser parser = new QueryParser(Version.LUCENE_35, "", analyzer);
        Query query = parser.parse("*:*");
        TopDocs hits = isearcher.search(query, null, 100);
        System.out.println("find size: " + hits.totalHits);
        java.net.InetAddress addr = java.net.InetAddress.getLocalHost();
        String computerName = addr.getHostName();
        //insert2Solr(solrServer, isearcher, hits);
        long beginTime = System.currentTimeMillis();
        long totalTime = 0;
        System.out.println("begin time: " + beginTime);
        try {
            Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
            for (int i = 0; i < hits.scoreDocs.length; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                Document hitDoc = isearcher.doc(hits.scoreDocs[i].doc);
                doc.addField("id", i + "a" + computerName + Thread.currentThread().getId());
                doc.addField("text", hitDoc.get("text"));
                docs.add(doc);

Re: Advice: solrCloud + DIH

2013-03-14 Thread rulinma
3 docs/s is low. I tested with 4 nodes and got more than 1000 docs/s with
4 KB docs on SolrCloud. Every leader has a replica.

I am tuning to improve this to 3000 docs/s. 3 docs/s is too slow.

Thanks!





Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Lance Norskog
Wow! That's great. And it's a lot of work, especially getting it all 
keyboard-complete. Thank you.


On 03/14/2013 01:29 AM, Chantal Ackermann wrote:

Hi all,


this is not a question. I just wanted to announce that I've written a blog post 
on how to set up Maven for packaging and automatic testing of a SOLR index 
configuration.

http://blog.it-agenten.com/2013/03/integration-testing-your-solr-index-with-maven/

Feedback or comments appreciated!
And again, thanks for that great piece of software.

Chantal





Re: Handling a closed IndexWriter in SOLR 4.0

2013-03-14 Thread Otis Gospodnetic
Hi Scott,

Not sure why IW would be closed, but:
* consider not (hard) committing after each doc, but just periodically,
every N minutes
* soft committing instead
* using 4.2
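
For example, instead of an explicit commit() after every add, a commitWithin
on the add itself lets Solr batch the hard commits. A minimal SolrJ sketch
(the URL, core, and field values are illustrative):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class CommitWithinExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "article-42");
        // Let Solr fold this add into a hard commit within 60 seconds,
        // instead of calling server.commit() after every add.
        server.add(doc, 60000);
        server.shutdown();
    }
}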

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Thu, Mar 14, 2013 at 11:55 AM, Danzig, Scott wrote:

> Hey all,
>
> We're using a Solr 4 core to handle our article data.  When someone in our
> CMS publishes an article, we have a listener that indexes it straight to
> solr.  We use the previously instantiated HttpSolrServer, build the solr
> document, add it with server.add(doc) .. then do a server.commit() right
> away.  For some reason, sometimes this exception is thrown, which I suspect
> is related to a simultaneous data import done from another client which
> sometimes errors:
>
> Feb 26, 2013 5:07:51 PM org.apache.solr.common.SolrException log
> SEVERE: null:org.apache.solr.common.SolrException: Error opening new
> searcher
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1310)
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1422)
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1200)
> at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:560)
> at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:87)
> at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
> at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1007)
> at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
> at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
> at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
> at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:225)
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
> at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
> at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:999)
> at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:565)
> at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:309)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:679)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
> at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:550)
> at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:563)
> at org.apache.lucene.index.IndexWriter.nrtIsCurrent(IndexWriter.java:4196)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:266)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:245)
> at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:235)
> at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:169)
> at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1256)
> ... 28 more
>
> I'm not sure if the error is causing the IndexWriter to close, and why an
> IndexWriter would be shared across clients, but usually, I can get around
> this by basically creating a new HttpSolrServer and trying again.  But it
> doesn't always work, perhaps due to frequency… I don't like the idea of an
> "infinite loop of creating connections until it works".  I'd rather
> understand what's going on.  What's the proper way to fix this?  I see I
> can add a doc with a commitWithinMs of "0" and maybe this couples the add
> tightly with the commit and would prevent interference.  But am I totally
> off the mark here as to the problem?  Suggestions?
>
> Posted this on java-u

Re: Question about email search

2013-03-14 Thread Alexandre Rafalovitch
Sure. copyField it into a new indexed, non-stored field with the following
type definition:

  <fieldType name="email_only" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
      <filter class="solr.TypeTokenFilterFactory" types="filter_email.txt"
              useWhitelist="true"/>
    </analyzer>
  </fieldType>

Content of filter_email.txt is (including <> signs):

<EMAIL>

You will have only the emails left as tokens. You can't display them easily,
but you can certainly search them.
Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Thu, Mar 14, 2013 at 2:33 PM, Jorge Luis Betancourt Gonzalez <
jlbetanco...@uci.cu> wrote:

> Sorry for the duplicated mail :-(. Any advice on a configuration for
> searching emails in a field that does not contain only email addresses,
> i.e. where the email addresses are embedded in larger text?
>
> - Original Message -
> From: "Ahmet Arslan" 
> To: solr-user@lucene.apache.org
> Sent: Thursday, March 14, 2013 11:23:47
> Subject: Re: Question about email search
>
> Hi,
>
> Since you have a word delimiter filter in your analysis chain, I am not sure
> if e-mail addresses are recognised. You can check that on the Solr admin UI,
> analysis page.
>
> If e-mail addresses are kept as one token, I would use a leading wildcard query:
> &q=*@gmail.com
>
> There was a similar question recently:
> http://search-lucene.com/m/XF2ejnM6Vi2
>
> --- On Thu, 3/14/13, Jorge Luis Betancourt Gonzalez 
> wrote:
>
> > From: Jorge Luis Betancourt Gonzalez 
> > Subject: Question about email search
> > To: solr-user@lucene.apache.org
> > Date: Thursday, March 14, 2013, 5:11 PM
> > I'm using solr 3.6.2 to crawl some
> > data using nutch, in my schema I've one field with all the
> > content extracted from the page, which could possibly
> > include email addresses, this is the configuration of my
> > schema:
> >
> > <fieldType name="text" class="solr.TextField"
> >            positionIncrementGap="100" autoGeneratePhraseQueries="true">
> >   <analyzer type="index">
> >     <!-- tokenizer and some filter tags were lost by the list archive -->
> >     <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >             words="stopwords.txt"/>
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> >             catenateAll="0" splitOnCaseChange="1"/>
> >     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > The thing is that I'm trying to search against a field of
> > this type (text) with a value like "@gmail.com", and I intend
> > to get documents containing that text. Any advice?
> >
> > slds
> > --
> > "It is only in the mysterious equation of love that any
> > logical reasons can be found."
> > "Good programmers often confuse halloween (31 OCT) with
> > christmas (25 DEC)"
> >
> >
>


Re: Solr indexing binary files

2013-03-14 Thread Jack Krupansky

Take a look at Solr Cell:

http://wiki.apache.org/solr/ExtractingRequestHandler

Include a dynamicField with a "*" pattern and you will see the wide variety 
of metadata that is available for PDF and other rich document formats.
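
For example, a minimal SolrJ sketch of posting a PDF through that handler
(assuming it is registered at /update/extract; the URL, file name, and id are
illustrative):

import java.io.File;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class ExtractExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
        req.addFile(new File("report.pdf"), "application/pdf");
        req.setParam("literal.id", "doc1"); // supply the unique key yourself
        req.setParam("commit", "true");
        server.request(req);
        server.shutdown();
    }
}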


-- Jack Krupansky

-Original Message- 
From: Luis

Sent: Thursday, March 14, 2013 3:30 PM
To: solr-user@lucene.apache.org
Subject: Solr indexing binary files

Hi, I am new to Solr and I am extracting metadata from binary files through
URLs stored in my database.  I would like to know what fields are available
for indexing from PDFs (the ones that would be initiated as in column=””).
For example, how would I extract something like file size, format or file
type?

I would also like to know how to create customized fields in Solr.  How are
that metadata and text content mapped into the Solr schema?  Would I have
to declare that in solrconfig.xml or do some more tweaking somewhere
else?  If someone has a code snippet that could show me, it would be greatly
appreciated.

Thank you in advance.







Solr indexing binary files

2013-03-14 Thread Luis
Hi, I am new to Solr and I am extracting metadata from binary files through
URLs stored in my database.  I would like to know what fields are available
for indexing from PDFs (the ones that would be initiated as in column=””).
For example, how would I extract something like file size, format or file
type?

I would also like to know how to create customized fields in Solr.  How are
that metadata and text content mapped into the Solr schema?  Would I have
to declare that in solrconfig.xml or do some more tweaking somewhere
else?  If someone has a code snippet that could show me, it would be greatly
appreciated.

Thank you in advance.






Re: Meaning of "Current" in Solr Cloud Statistics

2013-03-14 Thread Mark Miller
Something like 'Reader is Current' might be better. Personally, I don't even 
know if it's worth showing.

- Mark

On Mar 14, 2013, at 3:40 PM, Stefan Matheis  wrote:

> Perhaps the wording of "Current" is a bit too generic in that context? I'd 
> like to change that description if that clarifies things .. but not sure 
> which one is a better fit?
> 
> 
> 
> On Thursday, March 14, 2013 at 8:26 PM, Michael Della Bitta wrote:
> 
>> Stefan,
>> 
>> Thanks a lot! Makes sense. So I don't have to worry about my leader
>> thinking it's out of date, then.
>> 
>> Michael Della Bitta
>> 
>> 
>> Appinions
>> 18 East 41st Street, 2nd Floor
>> New York, NY 10017-6271
>> 
>> www.appinions.com (http://www.appinions.com)
>> 
>> Where Influence Isn’t a Game
>> 
>> 
>> On Thu, Mar 14, 2013 at 3:11 PM, Stefan Matheis
>> mailto:matheis.ste...@gmail.com)> wrote:
>>> Hey Michael
>>> 
>>> I was a bit confused because you mentioned SolrCloud in the subject. We're 
>>> talking about http://host:port/solr/#/collection1 (f.e.) right? And there, 
>>> the left-upper Box "Statistics" ?
>>> 
>>> If so, the Output comes from /solr/collection1/admin/luke ( 
>>> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/admin/LukeRequestHandler.java?view=markup#l551
>>>  ) which uses DirectoryReader.isCurrent() under the Hood.
>>> 
>>> That method contains an explanation in its javadocs: 
>>> http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/index/DirectoryReader.html#isCurrent()
>>> 
>>> HTH
>>> Stefan
>>> 
>>> 
>>> 
>>> On Thursday, March 14, 2013 at 7:01 PM, Michael Della Bitta wrote:
>>> 
>>>> Hi everyone,
>>>> 
>>>> Is there an official definition of the "Current" flag under Core >
>>>> Home > Statistics?
>>>> 
>>>> What would it mean if a shard leader is not "Current"?
>>>> 
>>>> Thanks,
>>>> 
>>>> Michael Della Bitta
>>>> 
>>>> 
>>>> Appinions
>>>> 18 East 41st Street, 2nd Floor
>>>> New York, NY 10017-6271
>>>> 
>>>> www.appinions.com (http://www.appinions.com)
>>>> 
>>>> Where Influence Isn’t a Game
> 
> 



Re: Meaning of "Current" in Solr Cloud Statistics

2013-03-14 Thread Stefan Matheis
Perhaps the wording of "Current" is a bit too generic in that context? I'd like 
to change that description if that clarifies things .. but not sure which one 
is a better fit?



On Thursday, March 14, 2013 at 8:26 PM, Michael Della Bitta wrote:

> Stefan,
>  
> Thanks a lot! Makes sense. So I don't have to worry about my leader
> thinking it's out of date, then.
>  
> Michael Della Bitta
>  
> 
> Appinions
> 18 East 41st Street, 2nd Floor
> New York, NY 10017-6271
>  
> www.appinions.com (http://www.appinions.com)
>  
> Where Influence Isn’t a Game
>  
>  
> On Thu, Mar 14, 2013 at 3:11 PM, Stefan Matheis
> mailto:matheis.ste...@gmail.com)> wrote:
> > Hey Michael
> >  
> > I was a bit confused because you mentioned SolrCloud in the subject. We're 
> > talking about http://host:port/solr/#/collection1 (f.e.) right? And there, 
> > the left-upper Box "Statistics" ?
> >  
> > If so, the Output comes from /solr/collection1/admin/luke ( 
> > http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/admin/LukeRequestHandler.java?view=markup#l551
> >  ) which uses DirectoryReader.isCurrent() under the Hood.
> >  
> > That method contains an explanation in its javadocs: 
> > http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/index/DirectoryReader.html#isCurrent()
> >  
> > HTH
> > Stefan
> >  
> >  
> >  
> > On Thursday, March 14, 2013 at 7:01 PM, Michael Della Bitta wrote:
> >  
> > > Hi everyone,
> > >  
> > > Is there an official definition of the "Current" flag under Core >
> > > Home > Statistics?
> > >  
> > > What would it mean if a shard leader is not "Current"?
> > >  
> > > Thanks,
> > >  
> > > Michael Della Bitta
> > >  
> > > 
> > > Appinions
> > > 18 East 41st Street, 2nd Floor
> > > New York, NY 10017-6271
> > >  
> > > www.appinions.com (http://www.appinions.com)
> > >  
> > > Where Influence Isn’t a Game  




Re: Meaning of "Current" in Solr Cloud Statistics

2013-03-14 Thread Michael Della Bitta
Stefan,

Thanks a lot! Makes sense. So I don't have to worry about my leader
thinking it's out of date, then.

Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Thu, Mar 14, 2013 at 3:11 PM, Stefan Matheis
 wrote:
> Hey Michael
>
> I was a bit confused because you mentioned SolrCloud in the subject. We're 
> talking about http://host:port/solr/#/collection1 (f.e.) right? And there, 
> the left-upper Box "Statistics" ?
>
> If so, the Output comes from /solr/collection1/admin/luke ( 
> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/admin/LukeRequestHandler.java?view=markup#l551
>  ) which uses DirectoryReader.isCurrent() under the Hood.
>
> That method contains an explanation in its javadocs: 
> http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/index/DirectoryReader.html#isCurrent()
>
> HTH
> Stefan
>
>
>
> On Thursday, March 14, 2013 at 7:01 PM, Michael Della Bitta wrote:
>
>> Hi everyone,
>>
>> Is there an official definition of the "Current" flag under Core >
>> Home > Statistics?
>>
>> What would it mean if a shard leader is not "Current"?
>>
>> Thanks,
>>
>> Michael Della Bitta
>>
>> 
>> Appinions
>> 18 East 41st Street, 2nd Floor
>> New York, NY 10017-6271
>>
>> www.appinions.com (http://www.appinions.com)
>>
>> Where Influence Isn’t a Game
>
>


Re: Searching across multiple collections (cores)

2013-03-14 Thread kfdroid
I'm assuming from that link I would use the following:

"Query all shards of multiple compatible collections, explicitly specified":
http://localhost:8983/solr/collection1/select?collection=collection1_NY,collection1_NJ,collection1_CT

where collection1_NY, NJ and CT could be books, movies, music in my case.
But what does it mean to be a 'compatible' collection?  If that just means
the schema.xml for each of them has to have a common field or fields that
can be searched against, then I'm golden. But if 'compatible' means
something more specific I would need a definition to see if this will work
for me.

Ken 





Re: Meaning of "Current" in Solr Cloud Statistics

2013-03-14 Thread Stefan Matheis
Hey Michael

I was a bit confused because you mentioned SolrCloud in the subject. We're 
talking about http://host:port/solr/#/collection1 (f.e.) right? And there, the 
left-upper Box "Statistics" ?

If so, the Output comes from /solr/collection1/admin/luke ( 
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/admin/LukeRequestHandler.java?view=markup#l551
 ) which uses DirectoryReader.isCurrent() under the Hood.

That method contains an explanation in its javadocs: 
http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/index/DirectoryReader.html#isCurrent()
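
In other words, the flag boils down to something like this (a sketch against
the Lucene 4.x API; the index path is illustrative):

import java.io.File;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class IsCurrentCheck {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/path/to/index"));
        DirectoryReader reader = DirectoryReader.open(dir);
        // false as soon as a new commit has been made to the index
        // since this reader was opened
        System.out.println("current: " + reader.isCurrent());
        reader.close();
        dir.close();
    }
}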

HTH
Stefan



On Thursday, March 14, 2013 at 7:01 PM, Michael Della Bitta wrote:

> Hi everyone,
>  
> Is there an official definition of the "Current" flag under Core >
> Home > Statistics?
>  
> What would it mean if a shard leader is not "Current"?
>  
> Thanks,
>  
> Michael Della Bitta
>  
> 
> Appinions
> 18 East 41st Street, 2nd Floor
> New York, NY 10017-6271
>  
> www.appinions.com (http://www.appinions.com)
>  
> Where Influence Isn’t a Game  




Re: Searching across multiple collections (cores)

2013-03-14 Thread Mark Miller
Yes, with SolrCloud, it's just the collection param (as long as the schemas are 
compatible for this):

http://wiki.apache.org/solr/SolrCloud#Distributed_Requests

- Mark

On Mar 14, 2013, at 2:55 PM, kfdroid  wrote:

> I've been looking all over for a clear answer to this question and can't seem
> to find one. It seems like a very basic concept to me though so maybe I'm
> using the wrong terminology.  I want to be able to search across multiple
> collections (as it is now called in SolrCloud world, previously called
> Cores).  I want the scoring, sorting, faceting etc. to be blended, that is
> to be relevant to data from all the collections, not just a set of
> independent results per collection.  Is that possible?
> 
> A real-world example would be a merchandise site that has books, movies and
> music. The index for each of those is quite different and they would have
> their own schema.xml (and therefore be their own Collection). When in the
> 'books' area of a website the users could search on fields specific to books
> (ISBN for example). However on a 'home' page a search would span across all
> 3 product lines, and the results should be scored relative to each other,
> not just relative to other items in their specific collection. 
> 
> Is this possible in v4.0? I'm pretty sure it wasn't in v1.4.1. But it seems
> to be a fundamentally useful concept, I was wondering if it had been
> addressed yet.
> Thanks,
> Ken
> 
> 
> 



Searching across multiple collections (cores)

2013-03-14 Thread kfdroid
I've been looking all over for a clear answer to this question and can't seem
to find one. It seems like a very basic concept to me though so maybe I'm
using the wrong terminology.  I want to be able to search across multiple
collections (as it is now called in SolrCloud world, previously called
Cores).  I want the scoring, sorting, faceting etc. to be blended, that is
to be relevant to data from all the collections, not just a set of
independent results per collection.  Is that possible?

A real-world example would be a merchandise site that has books, movies and
music. The index for each of those is quite different and they would have
their own schema.xml (and therefore be their own Collection). When in the
'books' area of a website the users could search on fields specific to books
(ISBN for example). However on a 'home' page a search would span across all
3 product lines, and the results should be scored relative to each other,
not just relative to other items in their specific collection. 

Is this possible in v4.0? I'm pretty sure it wasn't in v1.4.1. But it seems
to be a fundamentally useful concept, I was wondering if it had been
addressed yet.
Thanks,
Ken





Re: Question about email search

2013-03-14 Thread Jorge Luis Betancourt Gonzalez
Sorry for the duplicated mail :-(. Any advice on a configuration for searching
emails in a field that does not contain only email addresses, i.e. where the
email addresses are embedded in larger text?

- Original Message -
From: "Ahmet Arslan" 
To: solr-user@lucene.apache.org
Sent: Thursday, March 14, 2013 11:23:47
Subject: Re: Question about email search

Hi,

Since you have a word delimiter filter in your analysis chain, I am not sure if
e-mail addresses are recognised. You can check that on the Solr admin UI,
analysis page.

If e-mail addresses are kept as one token, I would use a leading wildcard query:
&q=*@gmail.com

There was a similar question recently:
http://search-lucene.com/m/XF2ejnM6Vi2

--- On Thu, 3/14/13, Jorge Luis Betancourt Gonzalez  wrote:

> From: Jorge Luis Betancourt Gonzalez 
> Subject: Question about email search
> To: solr-user@lucene.apache.org
> Date: Thursday, March 14, 2013, 5:11 PM
> I'm using solr 3.6.2 to crawl some
> data using nutch, in my schema I've one field with all the
> content extracted from the page, which could possibly
> include email addresses, this is the configuration of my
> schema:
>
> <fieldType name="text" class="solr.TextField"
>            positionIncrementGap="100" autoGeneratePhraseQueries="true">
>   <analyzer type="index">
>     <!-- tokenizer and some filter tags were lost by the list archive -->
>     <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
>             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>             catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> The thing is that I'm trying to search against a field of
> this type (text) with a value like "@gmail.com", and I intend
> to get documents containing that text. Any advice?
>
> slds
> --
> "It is only in the mysterious equation of love that any
> logical reasons can be found."
> "Good programmers often confuse halloween (31 OCT) with
> christmas (25 DEC)"
>
>


Re: Version conflict during data import from another Solr instance into clean Solr

2013-03-14 Thread Chris Hostetter

: It looks strange to me that if there is no document yet (foundVersion < 0)
: then the only case when document will be imported is when input version is
: negative. Guess I need to test specific cases using SolrJ or smth. to be sure.

you're assuming that if foundVersion < 0 that means no document *yet* ... 
it could also mean there was a document, and it's been deleted.

Either way if the client has said "(replace|update) version X of doc D" 
the code is failing because it can't: doc D does not exist with version 
X.  Regardless of whether someone deleted doc D, or replaced it with a 
newer version, or it never existed in the first place, Solr can't do what 
you asked it to do.

: Anyway I'll also check if I can inherit from SolrEntityProcessor and override
: _version_ field there before insertion.

Easier solutions to consider (off the cuff, not tested)...

1) in your SolrEntityProcessor, configure fl with something like this 
to alias the _version_ field to something else:

   fl=*,old_version:_version_
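
In DIH terms that might look like this (a sketch; the entity name and source
URL are illustrative):

   <entity name="sourceSolr" processor="SolrEntityProcessor"
           url="http://source-host:8983/solr/core1"
           query="*:*"
           fl="*,old_version:_version_"/>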

2) configure your destination solr instance with an update chain that 
ignores the _version_ field (you wouldn't want this for most normal usage, 
but it would be suitable for these kinds of from-scratch imports from 
other solr instances)...

https://lucene.apache.org/solr/4_2_0/solr-core/org/apache/solr/update/processor/IgnoreFieldUpdateProcessorFactory.html
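
A sketch of such a chain in solrconfig.xml (the chain name is illustrative;
select it on the import request via update.chain):

   <updateRequestProcessorChain name="ignore-version">
     <processor class="solr.IgnoreFieldUpdateProcessorFactory">
       <str name="fieldName">_version_</str>
     </processor>
     <processor class="solr.LogUpdateProcessorFactory"/>
     <processor class="solr.RunUpdateProcessorFactory"/>
   </updateRequestProcessorChain>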



-Hoss


Re: Solr 4.2 mechanism proxy request error

2013-03-14 Thread yriveiro
The log from the UI:

null:org.apache.solr.common.SolrException: Error trying to proxy request for
url: http://192.168.20.47:8983/solr/ST-3A856BBCA3_12/select

I will open the issue in Jira.

Thanks



-
Best regards


Re: Solr 4.2 mechanism proxy request error

2013-03-14 Thread Mark Miller
I'll add a test with rows = 0 and see how easy it is to replicate.

Looks to me like you should file a JIRA issue in any case.

- Mark

On Mar 14, 2013, at 2:04 PM, yriveiro  wrote:

> Hi, 
> 
> I think that in solr 4.2 the new feature to proxy a request if the
> collection is not in the requested node has a bug.
> 
> If I do a query with the parameter rows=0 and the node doesn't have the
> collection. If the parameter is rows=4 or superior then the search works as
> expected 
> 
> the curl returns 
> 
> The output of wget is:
> 
> Connecting to 192.168.20.48:8983... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 210 [application/xml]
> Saving to: ‘select?q=*:*&rows=0’
> 
> 0% [ ] 0  --.-K/s  in 0s
> 
> 2013-03-14 18:01:04 (0.00 B/s) - Connection closed at byte 0. Retrying.
> 
> Curl says:
> 
> curl "http://192.168.20.48:8983/solr/ST-3A856BBCA3_12/select?q=*%3A*&rows=0"
> curl: (56) Problem (2) in the Chunked-Encoded data
> 
> Chrome says:
> 
> This webpage is not available
> The webpage at
> http://192.168.20.48:8983/solr/ST-3A856BBCA3_12/select?q=*%3A*&rows=0&wt=xml&indent=true
> might be temporarily down or it may have moved permanently to a new web
> address.
> Error 321 (net::ERR_INVALID_CHUNKED_ENCODING): Unknown error.
> 
> Someone have the same issue?
> 
> 
> 
> 
> 
> -
> Best regards



Solr 4.2 mechanism proxy request error

2013-03-14 Thread yriveiro
Hi, 

I think that in solr 4.2 the new feature to proxy a request if the
collection is not in the requested node has a bug.

If I do a query with the parameter rows=0 and the node doesn't have the
collection. If the parameter is rows=4 or superior then the search works as
expected 

the curl returns 

The output of wget is:

Connecting to 192.168.20.48:8983... connected.
HTTP request sent, awaiting response... 200 OK
Length: 210 [application/xml]
Saving to: ‘select?q=*:*&rows=0’

 0% [ ] 0  --.-K/s  in 0s

2013-03-14 18:01:04 (0.00 B/s) - Connection closed at byte 0. Retrying.

Curl says:

curl "http://192.168.20.48:8983/solr/ST-3A856BBCA3_12/select?q=*%3A*&rows=0"
curl: (56) Problem (2) in the Chunked-Encoded data

Chrome says:

This webpage is not available
The webpage at
http://192.168.20.48:8983/solr/ST-3A856BBCA3_12/select?q=*%3A*&rows=0&wt=xml&indent=true
might be temporarily down or it may have moved permanently to a new web
address.
Error 321 (net::ERR_INVALID_CHUNKED_ENCODING): Unknown error.

Someone have the same issue?





-
Best regards



Meaning of "Current" in Solr Cloud Statistics

2013-03-14 Thread Michael Della Bitta
Hi everyone,

Is there an official definition of the "Current" flag under Core >
Home > Statistics?

What would it mean if a shard leader is not "Current"?

Thanks,

Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


Re: OutOfMemoryError

2013-03-14 Thread Shawn Heisey

On 3/14/2013 3:35 AM, Arkadi Colson wrote:

Hi

I'm getting this error after a few hours of filling solr with documents.
Tomcat is running with -Xms1024m -Xmx4096m.
Total memory of the host is 12GB. Soft commits are done every second and hard
commits every minute.
Any idea why this is happening and how to avoid this?


top
   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+ COMMAND
13666 root  20   0 86.8g 4.7g 248m S  101 39.7 478:37.45
/usr/bin/java
-Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties 
-server
-Xms1024m -Xmx4096m -XX:PermSize=64m -XX:MaxPermSize=128m
-Duser.timezone=UTC -Dfile.encoding=UTF8 -Dsolr.solr.home=/opt/solr/
-Dport=8983 -Dcollection.configName
22247 root  20   0 2430m 409m 4176 S0  3.4   1:23.43 java
-Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp
/opt/zookeeper/bin/../build/classes:/opt/zookeeper/bin/../build/lib/*.jar:/opt/zookeeper/bi



free -m
                  total       used       free     shared    buffers     cached
Mem:              12047      11942        105          0        180       6363
-/+ buffers/cache:            5399       6648
Swap:               956         75        881


As you've already been told, this looks like you have about 80GB of 
index.  I ran into Out Of Memory problems with heavy indexing with a 4GB 
heap on a total index size just a little bit smaller than this.  I had 
to increase the heap size to 8GB.


With heap sizes this large, you'll see garbage collection pause problems 
without careful tuning.  You're probably already having these problems 
with the 4GB heap, but they'll get much worse with an 8GB heap.  Here 
are the memory options I'm using that got rid of my GC pause problem. 
I'm using these with with the Sun/Oracle JVM, on both 1.6 and 1.7:


-Xmx8192M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 
-XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled 
-XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts


I notice that you've got options that change the PermSize and 
MaxPermSize.  You probably don't need these options, unless you know 
that you'll run into problems without it.


Additional note: if you have greatly increased ramBufferSizeMB, try 
reducing it to 100, the default on recent versions.  The default used to 
be 32.  Either amount is usually plenty, unless you have huge documents.


Side comment: 12GB total RAM isn't going to be enough memory for top 
performance with 80GB of index.  You'll probably need 8GB of java heap, 
plus between 40 and 80GB of memory for the OS disk cache, to fit a large 
chunk (or all) of your index into RAM.  48GB would be a good start, 64 
to 128GB would be better.


Thanks,
Shawn



Re: Strange error in Solr 4.2

2013-03-14 Thread Mark Miller

On Mar 14, 2013, at 1:27 PM, Shawn Heisey  wrote:

> I have been told that it is possible to override the handleError method to 
> fix this

I'd say mitigate more than fix. I think the real fix requires some dev work. 

- Mark

Re: Strange error in Solr 4.2

2013-03-14 Thread Shawn Heisey

On 3/14/2013 9:24 AM, Uwe Klosa wrote:

This exception occurs in this part

new ConcurrentUpdateSolrServer("http://solr.diva-portal.org:8080/search", 5, 50)


Side comment, unrelated to your question:

If you're already aware that ConcurrentUpdateSolrServer has no built-in 
error handling and you're OK with that, then you don't need to be 
concerned with this message.


ConcurrentUpdateSolrServer swallows any exception that happens during 
its operation.  Errors get logged, but are not passed back to the 
calling application.  Update requests always succeed, even if Solr is 
completely down.


I have been told that it is possible to override the handleError method 
to fix this, but I don't know what code to actually use.
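
A minimal sketch of such an override, assuming SolrJ 4.x where
handleError(Throwable) is the hook that swallows the exception (the
error-recording strategy here is illustrative):

import java.util.concurrent.atomic.AtomicReference;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;

public class TrackingUpdateServerFactory {
    // Remembers the last error so callers can check for it after updates.
    static final AtomicReference<Throwable> lastError = new AtomicReference<Throwable>();

    static ConcurrentUpdateSolrServer create(String url) {
        return new ConcurrentUpdateSolrServer(url, 5, 50) {
            @Override
            public void handleError(Throwable ex) {
                super.handleError(ex); // keep the default logging
                lastError.set(ex);     // surface the failure to the caller
            }
        };
    }
}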


Thanks,
Shawn



Re: Out of Memory doing a query Solr 4.2

2013-03-14 Thread Robert Muir
On Thu, Mar 14, 2013 at 12:07 PM, raulgrande83  wrote:
> JVM: IBM J9 VM(1.6.0.2.4)

I don't recommend using this JVM.


Re: Strange error in Solr 4.2

2013-03-14 Thread Stefan Matheis


On Thursday, March 14, 2013 at 4:57 PM, Uwe Klosa wrote:

> I found the answer myself. Thanks for the pointer.


Would you mind sharing your answer, Uwe? 


ids request to shard with star query are slow

2013-03-14 Thread srinir
I have a distributed Solr environment and I am investigating all the
requests where the shard took a significant amount of time. One common pattern
I saw was that all the ids requests with q=*:* and ids= took around
2-3 sec. I picked some shard requests with q=xyz and ids= and all of them
took only a few milliseconds.

I copied the params and manually sent the same request to that particular
shard, and again it took around 2.5 sec. But when I removed the query (q=*:*)
parameter and sent the same set of params to the same shard, I got the
response back in about 10 milliseconds. In both cases the response had the
document I am looking for.

took 2-3 sec
-
q=*:*&
qt=search&
ids=123&
isShard=true

took 20ms
-
qt=search&
ids=123&
isShard=true

In my understanding, the ids param is used to get the stored fields in a
distributed search. Why does the query parameter (q=) matter here?

Thanks
Srini





Out of Memory doing a query Solr 4.2

2013-03-14 Thread raulgrande83
Hi 

After doing a query to Solr to get the uniqueIds (strings of 20 characters)
of 700 documents in a collection, I'm getting an out-of-memory error using
Solr 4.2. I tried increasing the JVM memory by 1G (from 3G to 4G); however,
this didn't change anything.

This was working on 3.5. 

I've moved from 3.5 to 4.2.

Did anyone have the same problem?

Thanks


--

Details :

Solr 4.2
Solr Index 20G aprox.

JVM: IBM J9 VM(1.6.0.2.4)
JVM-Memory:4G
S.O. Linux
Processors 8
RAM: 101G



org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError
at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:651)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:364)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:164)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
at org.apache.catalina.ha.session.JvmRouteBinderValve.invoke(JvmRouteBinderValve.java:218)
at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:333)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:394)
at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:284)
at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:322)
at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1714)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:898)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:920)
at java.lang.Thread.run(Thread.java:736)
Caused by: java.lang.OutOfMemoryError
at java.util.Arrays.copyOfRange(Arrays.java:4114)
at java.util.Arrays.copyOf(Arrays.java:3833)
at java.lang.StringCoding.safeTrim(StringCoding.java:686)
at java.lang.StringCoding.access$300(StringCoding.java:41)
at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:739)
at java.lang.StringCoding.decode(StringCoding.java:746)
at java.lang.String.<init>(String.java:2036)
at java.lang.String.<init>(String.java:2011)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.readField(CompressingStoredFieldsReader.java:143)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:272)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:139)
at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:116)
at org.apache.lucene.index.IndexReader.document(IndexReader.java:436)
at org.apache.lucene.document.LazyDocument.getDocument(LazyDocument.java:65)
at org.apache.lucene.document.LazyDocument.access$000(LazyDocument.java:36)
at org.apache.lucene.document.LazyDocument$LazyField.stringValue(LazyDocument.java:105)
at org.apache.solr.schema.FieldType.toExternal(FieldType.java:346)
at org.apache.solr.schema.FieldType.toObject(FieldType.java:355)
at org.apache.solr.response.BinaryResponseWriter$Resolver.getValue(BinaryResponseWriter.java:208)
at org.apache.solr.response.BinaryResponseWriter$Resolver.getDoc(BinaryResponseWriter.java:186)
at org.apache.solr.response.BinaryResponseWriter$Resolver.writeResultsBody(BinaryResponseWriter.java:147)
at org.apache.solr.response.BinaryResponseWriter$Resolver.writeResults(BinaryResponseWriter.java:173)
at org.apache.solr.response.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:86)
at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:154)
at org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:144)
at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:234)
at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:149)
at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:92)
at org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:50

Re: Replication

2013-03-14 Thread Timothy Potter
Hi Arkadi,

If the update delta between the shard leader and a replica is more than 100
docs, then Solr punts and replicates the entire index. Last I heard, the 100
was hard-coded in 4.0, so it is not configurable. This makes sense because the
replica shouldn't be out of sync with the leader unless it has been offline.

Cheers,
Tim

On Thu, Mar 14, 2013 at 9:05 AM, Arkadi Colson  wrote:

> Based on what does Solr replicate the whole shard again from zero? From
> time to time, after a restart of Tomcat, Solr copies over the whole shard to
> the replica instead of copying only the changes.
>
> BR,
> Arkadi
>


Re: Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
I found the answer myself. Thanks for the pointer.

Cheers
Uwe


On 14 March 2013 16:48, Uwe Klosa  wrote:

> Thanks, but nobody has tampered with keystores. I have tested the
> application on different machines. Always the same exception is thrown.
>
> Do we have to set some system property to fix this?
>
> /Uwe
>
>
>
>
> On 14 March 2013 16:36, Mark Miller  wrote:
>
>> Perhaps as a result of https://issues.apache.org/jira/browse/SOLR-4451 ?
>>
>> Just a guess.
>>
>> The root cause looks to be:
>>
>> > Caused by: java.io.IOException: Keystore was tampered with, or password
>> was
>> > incorrect
>>
>>
>> - Mark
>>
>> On Mar 14, 2013, at 11:24 AM, Uwe Klosa  wrote:
>>
>> > Hi
>> >
>> > We have been using Solr 4.0 for a while now and wanted to upgrade to
>> 4.2.
>> > But our application stopped working. When we tried 4.1 it was working as
>> > expected.
>> >
>> > Here is a description of the situation.
>> >
>> > We deploy a Solr web application under Java 7 on a Glassfish 3.1.2.2
>> > server. We added some classes to the standard Solr webapp which listen
>> > to a JMS service and update the index according to the message content
>> > (e.g. "fetch the document with this id from that URL and add it to the
>> > index"). The documents are fetched via SSL from a repository server.
>> >
>> > This has been working well since Solr 1.2 for about 6 years now. With
>> Solr
>> > 4.2 we suddenly get the following error:
>> >
>> > javax.ejb.CreateException: Initialization failed for Singleton
>> > IndexMessageClientFactory
>> > at com.sun.ejb.containers.AbstractSingletonContainer.createSingletonEJB(AbstractSingletonContainer.java:547)
>> > ...
>> > Caused by: org.apache.http.conn.ssl.SSLInitializationException: Failure
>> > initializing default system SSL context
>> > at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:368)
>> > at org.apache.http.conn.ssl.SSLSocketFactory.getSystemSocketFactory(SSLSocketFactory.java:204)
>> > at org.apache.http.impl.conn.SchemeRegistryFactory.createSystemDefault(SchemeRegistryFactory.java:82)
>> > at org.apache.http.impl.client.SystemDefaultHttpClient.createClientConnectionManager(SystemDefaultHttpClient.java:118)
>> > at org.apache.http.impl.client.AbstractHttpClient.getConnectionManager(AbstractHttpClient.java:466)
>> > at org.apache.solr.client.solrj.impl.HttpClientUtil.setMaxConnections(HttpClientUtil.java:179)
>> > at org.apache.solr.client.solrj.impl.HttpClientConfigurer.configure(HttpClientConfigurer.java:33)
>> > at org.apache.solr.client.solrj.impl.HttpClientUtil.configureClient(HttpClientUtil.java:115)
>> > at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:105)
>> > at org.apache.solr.client.solrj.impl.HttpSolrServer.<init>(HttpSolrServer.java:155)
>> > at org.apache.solr.client.solrj.impl.HttpSolrServer.<init>(HttpSolrServer.java:132)
>> > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.<init>(ConcurrentUpdateSolrServer.java:101)
>> > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.<init>(ConcurrentUpdateSolrServer.java:93)
>> > at diva.commons.search.cdi.SolrServerFactory.init(SolrServerFactory.java:56)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > at java.lang.reflect.Method.invoke(Method.java:601)
>> > at com.sun.ejb.containers.interceptors.BeanCallbackInterceptor.intercept(InterceptorManager.java:1009)
>> > at com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:65)
>> > at com.sun.ejb.containers.interceptors.CallbackInvocationContext.proceed(CallbackInvocationContext.java:113)
>> > at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCallback(SystemInterceptorProxy.java:138)
>> > at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.init(SystemInterceptorProxy.java:120)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > at java.lang.reflect.Method.invoke(Method.java:601)
>> > at com.sun.ejb.containers.interceptors.CallbackInterceptor.intercept(InterceptorManager.java:964)
>> > at com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:65)
>> > at com.sun.e

Handling a closed IndexWriter in SOLR 4.0

2013-03-14 Thread Danzig, Scott
Hey all,

We're using a Solr 4 core to handle our article data.  When someone in our CMS 
publishes an article, we have a listener that indexes it straight to solr.  We 
use the previously instantiated HttpSolrServer, build the solr document, add it 
with server.add(doc) .. then do a server.commit() right away.  For some reason, 
sometimes this exception is thrown, which I suspect is related to a 
simultaneous data import done from another client which sometimes errors:

Feb 26, 2013 5:07:51 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1310)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1422)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1200)
at 
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:560)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:87)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1007)
at 
org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
at 
org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:225)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:999)
at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:565)
at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:309)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is 
closed
at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:550)
at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:563)
at org.apache.lucene.index.IndexWriter.nrtIsCurrent(IndexWriter.java:4196)
at 
org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:266)
at 
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:245)
at 
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:235)
at 
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:169)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1256)
... 28 more

I'm not sure if the error is causing the IndexWriter to close, or why an 
IndexWriter would be shared across clients, but usually I can get around this 
by creating a new HttpSolrServer and trying again. It doesn't always work, 
perhaps due to frequency… I don't like the idea of an "infinite loop of 
creating connections until it works"; I'd rather understand what's going on. 
What's the proper way to fix this? I see I can add a doc with a commitWithin 
of 0 ms, and maybe that couples the add tightly with the commit and would 
prevent interference. But am I totally off the mark here as to the problem? 
Suggestions?
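
A minimal SolrJ sketch of that commitWithin idea, for what it's worth - the 
URL, the field names and the 10-second window are illustrative placeholders, 
not values from this setup:

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class CommitWithinExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/articles");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "article-42");        // illustrative fields
        doc.addField("title", "Example article");
        // Ask Solr to commit within 10 seconds instead of calling
        // server.commit() ourselves; the server coalesces these commits,
        // so concurrent clients no longer race each other's explicit commit.
        server.add(doc, 10000);
        server.shutdown();
    }
}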

Posted this on java-user before, but then realized solr-user existed, so please 
forgive the redundancy…

Thanks for reading!

- Scott

Re: Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
Thanks, but nobody has tampered with any keystores. I have tested the
application on different machines. Always the same exception is thrown.

Do we have to set some system property to fix this?
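
As a hedged sketch only, not a confirmed fix: SOLR-4451 made SolrJ build its 
HttpClient from the system default SSL context, which loads the keystore named 
by the standard javax.net.ssl system properties, so one thing to check is that 
those properties and the actual keystore password agree. The paths and 
passwords below are placeholders, not values from this thread:

public class SslSystemProperties {
    public static void main(String[] args) {
        // Standard JSSE properties consulted by HttpClient's
        // SSLSocketFactory.createSystemSSLContext(). All values here are
        // placeholders; "Password verification failed" suggests the
        // password the JVM was given does not match the keystore.
        System.setProperty("javax.net.ssl.keyStore", "/path/to/keystore.jks");
        System.setProperty("javax.net.ssl.keyStorePassword", "changeit");
        System.setProperty("javax.net.ssl.trustStore", "/path/to/truststore.jks");
        System.setProperty("javax.net.ssl.trustStorePassword", "changeit");
        // ... then create the ConcurrentUpdateSolrServer as before.
    }
}

The same properties can also be passed as -D options in the Glassfish JVM 
configuration instead of being set programmatically.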

/Uwe




On 14 March 2013 16:36, Mark Miller  wrote:

> Perhaps as a result of https://issues.apache.org/jira/browse/SOLR-4451 ?
>
> Just a guess.
>
> The root cause looks to be:
>
> > Caused by: java.io.IOException: Keystore was tampered with, or password
> was
> > incorrect
>
>
> - Mark
>
> On Mar 14, 2013, at 11:24 AM, Uwe Klosa  wrote:
>
> > Hi
> >
> > We have been using Solr 4.0 for a while now and wanted to upgrade to 4.2.
> > But our application stopped working. When we tried 4.1 it was working as
> > expected.
> >
> > Here is a description of the situation.
> >
> > We deploy a Solr web application under Java 7 on a GlassFish 3.1.2.2
> > server. We added some classes to the standard Solr webapp which listen
> > to a JMS service and update the index according to the message content,
> > which can be: fetch the document with this id from that URL and add it
> > to the index. The documents are fetched via SSL from a repository
> > server.
> >
> > This has been working well since Solr 1.2, for about 6 years now. With
> > Solr 4.2 we suddenly get the following error:
> >
> > javax.ejb.CreateException: Initialization failed for Singleton
> > IndexMessageClientFactory
> > [stack trace trimmed; it is quoted in full in the original message below]

need general advice on how others version and manage core deployments over time

2013-03-14 Thread geeky2
hello everyone,

I know this is a general topic - but I would really appreciate info from
others that are doing this now.

  - How are others managing this so that users are impacted the least?
  - How are others handling the scenario where users don't want to migrate
    forward?

thx
mark






--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-general-advice-on-how-others-version-and-mange-core-deployments-over-time-tp4047390.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Strange error in Solr 4.2

2013-03-14 Thread Mark Miller
Perhaps as a result of https://issues.apache.org/jira/browse/SOLR-4451 ?

Just a guess.

The root cause looks to be:

> Caused by: java.io.IOException: Keystore was tampered with, or password was
> incorrect


- Mark

On Mar 14, 2013, at 11:24 AM, Uwe Klosa  wrote:

> Hi
> 
> We have been using Solr 4.0 for a while now and wanted to upgrade to 4.2.
> But our application stopped working. When we tried 4.1 it was working as
> expected.
> 
> Here is a description of the situation.
> 
> We deploy a Solr web application under Java 7 on a GlassFish 3.1.2.2
> server. We added some classes to the standard Solr webapp which listen to
> a JMS service and update the index according to the message content, which
> can be: fetch the document with this id from that URL and add it to the
> index. The documents are fetched via SSL from a repository server.
> 
> This has been working well since Solr 1.2, for about 6 years now. With Solr
> 4.2 we suddenly get the following error:
> 
> javax.ejb.CreateException: Initialization failed for Singleton
> IndexMessageClientFactory
> [stack trace trimmed; it is quoted in full in the original message below]

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Mark Miller
What calls are you using to get the versions? Or is it the admin UI?

Also can you add any details about your setup - if this is a problem, we need 
to duplicate it in one of our unit tests.

Also, is it affecting proper replication in any way that you can tell?

- Mark

On Mar 14, 2013, at 11:12 AM, richardg  wrote:

> I believe this is the same issue as described, I'm running 4.2 and as you can
> see my slave is a couple versions ahead of the master (all three slaves show
> the same behavior).  This was never the case until I upgraded from 4.0 to
> 4.2.
> 
>          Index Version    Gen   Size
> Master:  1363272681951    93    1,022.31 MB
> Slave:   1363273274085    95    1,022.31 MB
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-4-1-monitoring-with-solr-replication-command-details-indexVersion-tp4047329p4047380.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
Hi

We have been using Solr 4.0 for a while now and wanted to upgrade to 4.2.
But our application stopped working. When we tried 4.1 it was working as
expected.

Here is a description of the situation.

We deploy a Solr web application under Java 7 on a GlassFish 3.1.2.2
server. We added some classes to the standard Solr webapp which listen to
a JMS service and update the index according to the message content, which
can be: fetch the document with this id from that URL and add it to the
index. The documents are fetched via SSL from a repository server.

This has been working well since Solr 1.2, for about 6 years now. With Solr
4.2 we suddenly get the following error:

javax.ejb.CreateException: Initialization failed for Singleton
IndexMessageClientFactory
at
com.sun.ejb.containers.AbstractSingletonContainer.createSingletonEJB(AbstractSingletonContainer.java:547)
...
Caused by: org.apache.http.conn.ssl.SSLInitializationException: Failure
initializing default system SSL context
at
org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:368)
at
org.apache.http.conn.ssl.SSLSocketFactory.getSystemSocketFactory(SSLSocketFactory.java:204)
at
org.apache.http.impl.conn.SchemeRegistryFactory.createSystemDefault(SchemeRegistryFactory.java:82)
at
org.apache.http.impl.client.SystemDefaultHttpClient.createClientConnectionManager(SystemDefaultHttpClient.java:118)
at
org.apache.http.impl.client.AbstractHttpClient.getConnectionManager(AbstractHttpClient.java:466)
at
org.apache.solr.client.solrj.impl.HttpClientUtil.setMaxConnections(HttpClientUtil.java:179)
at
org.apache.solr.client.solrj.impl.HttpClientConfigurer.configure(HttpClientConfigurer.java:33)
at
org.apache.solr.client.solrj.impl.HttpClientUtil.configureClient(HttpClientUtil.java:115)
at
org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:105)
at
org.apache.solr.client.solrj.impl.HttpSolrServer.<init>(HttpSolrServer.java:155)
at
org.apache.solr.client.solrj.impl.HttpSolrServer.<init>(HttpSolrServer.java:132)
at
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.<init>(ConcurrentUpdateSolrServer.java:101)
at
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.<init>(ConcurrentUpdateSolrServer.java:93)
at
diva.commons.search.cdi.SolrServerFactory.init(SolrServerFactory.java:56)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at
com.sun.ejb.containers.interceptors.BeanCallbackInterceptor.intercept(InterceptorManager.java:1009)
at
com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:65)
at
com.sun.ejb.containers.interceptors.CallbackInvocationContext.proceed(CallbackInvocationContext.java:113)
at
com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCallback(SystemInterceptorProxy.java:138)
at
com.sun.ejb.containers.interceptors.SystemInterceptorProxy.init(SystemInterceptorProxy.java:120)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at
com.sun.ejb.containers.interceptors.CallbackInterceptor.intercept(InterceptorManager.java:964)
at
com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:65)
at
com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:393)
at
com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:376)
at
com.sun.ejb.containers.AbstractSingletonContainer.createSingletonEJB(AbstractSingletonContainer.java:538)
... 103 more
Caused by: java.io.IOException: Keystore was tampered with, or password was
incorrect
at
sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:772)
at
sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:55)
at java.security.KeyStore.load(KeyStore.java:1214)
at
org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:281)
at
org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:366)
... 134 more
Caused by: java.security.UnrecoverableKeyException: Password verification
failed
at
sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:770)


This exception occurs in this part:

new ConcurrentUpdateSolrServer("http://solr.diva-portal.org:8080/search

Re: Question about email search

2013-03-14 Thread Ahmet Arslan
Hi,

Since you have a word delimiter filter in your analysis chain, I am not sure if 
e-mail addresses are kept intact. You can check that on the Solr admin UI, on 
the analysis page. 

If the e-mail addresses are kept as one token, I would use a leading wildcard query:
&q=*@gmail.com

There was a similar question recently: 
http://search-lucene.com/m/XF2ejnM6Vi2
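
A small SolrJ sketch of the leading-wildcard suggestion; the field name 
"content" and the core URL are assumptions, and it presumes the address really 
does survive analysis as a single token:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class LeadingWildcardEmailSearch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        // Leading wildcard: matches any token ending in "@gmail.com",
        // provided leading wildcards are allowed for this field/parser.
        SolrQuery query = new SolrQuery("content:*@gmail.com");
        QueryResponse rsp = server.query(query);
        System.out.println("hits: " + rsp.getResults().getNumFound());
        server.shutdown();
    }
}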

--- On Thu, 3/14/13, Jorge Luis Betancourt Gonzalez  wrote:

> From: Jorge Luis Betancourt Gonzalez 
> Subject: Question about email search
> To: solr-user@lucene.apache.org
> Date: Thursday, March 14, 2013, 5:11 PM
> I'm using Solr 3.6.2 to index data crawled with Nutch. In my schema I
> have one field with all the content extracted from the page, which could
> possibly include email addresses. This is the configuration of my
> schema:
> 
> [fieldType definition mangled by the list archive; the surviving
> attributes show a solr.TextField with positionIncrementGap="100" and
> autoGeneratePhraseQueries="true", whose index analyzer includes a
> Spanish stemmer, a stop filter (ignoreCase="true"
> words="stopwords.txt"), a word delimiter filter (generateWordParts="1"
> generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> catenateAll="0" splitOnCaseChange="1") and a
> RemoveDuplicatesTokenFilterFactory]
> 
> The thing is that I'm trying to search against a field of this type
> (text) with a value like "@gmail.com" and I'm expecting to get documents
> containing that text. Any advice?
> 
> slds
> --
> "It is only in the mysterious equation of love that any 
> logical reasons can be found."
> "Good programmers often confuse halloween (31 OCT) with 
> christmas (25 DEC)"
> 
>


Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread richardg
I believe this is the same issue as described, I'm running 4.2 and as you can
see my slave is a couple versions ahead of the master (all three slaves show
the same behavior).  This was never the case until I upgraded from 4.0 to
4.2.

         Index Version    Gen   Size
Master:  1363272681951    93    1,022.31 MB
Slave:   1363273274085    95    1,022.31 MB



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-1-monitoring-with-solr-replication-command-details-indexVersion-tp4047329p4047380.html
Sent from the Solr - User mailing list archive at Nabble.com.


Question about email search

2013-03-14 Thread Jorge Luis Betancourt Gonzalez
I'm using Solr 3.6.2 to index data crawled with Nutch. In my schema I have one 
field with all the content extracted from the page, which could possibly 
include email addresses. This is the configuration of my schema:

[fieldType definition stripped by the list archive; a summary of its
surviving attributes appears in the quoted copy of this message above]

The thing is that I'm trying to search against a field of this type (text) with 
a value like "@gmail.com" and I'm expecting to get documents containing that 
text. Any advice?

slds
--
"It is only in the mysterious equation of love that any 
logical reasons can be found."
"Good programmers often confuse halloween (31 OCT) with 
christmas (25 DEC)"



Replication

2013-03-14 Thread Arkadi Colson
What makes Solr decide to replicate the whole shard again from scratch? From 
time to time, after a restart of Tomcat, Solr copies the whole shard over 
to the replica instead of transferring only the changes.


BR,
Arkadi


Re: OutOfMemoryError

2013-03-14 Thread Arkadi Colson


On 03/14/2013 03:11 PM, Toke Eskildsen wrote:

> Are you reading top & free right? It is standard behaviour for most
> modern operating systems to have very little free memory. As long as the
> sum of free memory and cache is high, everything is fine.
>
> [breakdown of the top/free numbers trimmed; it is repeated in full in
> Toke's original reply below]
>
> A quick search shows that other people have had problems with ZipFile in
> at least some sub-versions of Java 1.7. However, another very common
> cause for OOM with memory mapping is that the limit for allocating
> virtual memory is too low.

We do not index zip files, so that could not cause the problem.

> Try doing a
>   ulimit -v
> on the machine. If the number is somewhere around 100000000 (100GB),
> Lucene's memory mapping of your index (the 80GB) plus the ZipFile's
> memory mapping plus other processes might hit the ceiling. If that is
> the case, simply raise the limit.

ulimit -v shows me unlimited.


I decreased the hard commit time to 10 seconds and set ramBufferSizeMB 
to 250. Hope this helps...

Will keep you informed!

Thanks for the explanation!


Re: Advice: solrCloud + DIH

2013-03-14 Thread Mark Miller

On Mar 14, 2013, at 9:22 AM, roySolr  wrote:

> Hello,
> 
>  When I run this it goes at 3 docs/s (really
> slow). When I run Solr alone (not SolrCloud) it goes at 600 docs/sec. 
> 
> What's the best way to do a full re-index with solrcloud? Does solrcloud
> support DIH?
> 
> Thanks
> 

SolrCloud supports DIH, but not fully and happily. DIH is set up to work pretty 
nicely without SolrCloud - it will load pretty quickly - but with SolrCloud a few 
things can happen: one is that you might be running DIH on a replica rather 
than a leader - and that can change without your consent - in which case all 
docs will go to another node and then come back. SolrCloud also works best with 
multiple threads, really - and DIH will only use one, to my knowledge.
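
A rough sketch of the multi-threaded SolrJ alternative to DIH alluded to 
above; the JDBC URL, credentials, query, field names, thread count and 
commitWithin window are all assumptions, not values from this thread:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ParallelReindex {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper ensemble and collection name.
        final CloudSolrServer solr = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        solr.setDefaultCollection("collection1");
        ExecutorService pool = Executors.newFixedThreadPool(8); // assumed thread count

        // Assumed JDBC source holding the full data set.
        Connection con = DriverManager.getConnection("jdbc:mysql://dbhost/db", "user", "pass");
        Statement st = con.createStatement();
        ResultSet rs = st.executeQuery("SELECT id, content, content2 FROM docs");
        while (rs.next()) {
            final SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", rs.getString("id"));
            doc.addField("content", rs.getString("content"));
            doc.addField("content2", rs.getString("content2"));
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        // commitWithin instead of per-document commits.
                        solr.add(doc, 60000);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            });
        }
        rs.close();
        st.close();
        con.close();
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        solr.shutdown();
    }
}

Batching several documents per add() call would cut the overhead further.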

Still, at 3 docs/s, something sounds wrong. That's too slow.

- Mark



Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Mark Miller

On Mar 14, 2013, at 8:10 AM, Rafał Radecki  wrote:

> Is this a bug?

Yes, 4.1 had some replication issues just as you seem to describe here. It all 
should be fixed in 4.2 which is available now and is a simple upgrade.

- Mark

Re: OutOfMemoryError

2013-03-14 Thread Toke Eskildsen
On Thu, 2013-03-14 at 13:10 +0100, Arkadi Colson wrote:
> When I shut down Tomcat, free -m and top keep telling me the same values. 
> Almost no free memory...
> 
> Any idea?

Are you reading top & free right? It is standard behaviour for most
modern operating systems to have very little free memory. As long as the
sum of free memory and cache is high, everything is fine.

Looking at the stats you gave previously we have

> > *top*
> >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
> > 13666 root  20   0 86.8g 4.7g 248m S  101 39.7 478:37.45 

4.7GB physical memory used and ~80GB used for memory mapping the index.

> > *free -m*
> >              total       used       free     shared    buffers     cached
> > Mem:         12047      11942        105          0        180       6363
> > -/+ buffers/cache:       5399       6648
> > Swap:          956         75        881

So 6648MB used for either general disk cache or memory mapped index.
This really translates to 6648MB (plus the 105MB above) available memory
as any application asking for memory will get it immediately from that
pool (sorry if this is basic stuff for you).

> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > at java.lang.Thread.run(Thread.java:662)
> > Caused by: java.lang.OutOfMemoryError
> > at java.util.zip.ZipFile.open(Native Method)
> > at java.util.zip.ZipFile.<init>(ZipFile.java:127)
> > at java.util.zip.ZipFile.<init>(ZipFile.java:144)
> > at 
> > org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipFile(ZipHelper.java:157)

[...]

> > Java HotSpot(TM) 64-Bit Server VM warning: Attempt to allocate stack 
> > guard pages failed.
> > mmap failed for CEN and END part of zip file

A quick search shows that other people have had problems with ZipFile in
at least some sub-versions of Java 1.7. However, another very common
cause for OOM with memory mapping is that the limit for allocating
virtual memory is too low.

Try doing a
 ulimit -v
on the machine. If the number is somewhere around 100000000 (100GB),
Lucene's memory mapping of your index (the 80GB) plus the ZipFile's
memory mapping plus other processes might hit the ceiling. If that is
the case, simply raise the limit.

- Toke



Re: Poll: Largest SolrCloud out there?

2013-03-14 Thread Otis Gospodnetic
Christian,

SSDs will warm up muuuch faster.
Your other questions require more info / discussion.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On Mar 14, 2013 8:47 AM, "Christian von Wendt-Jensen" <
christian.vonwendt-jen...@infopaq.com> wrote:

> Does it only count if you are using SolrCloud? We are using a traditional
> Master/Slave setup with Solr 4.1:
> [rest of the quoted message trimmed; see Christian's original message
> below]


Advice: solrCloud + DIH

2013-03-14 Thread roySolr
Hello,

I need some advice with my SolrCloud cluster and the DIH. I have a cluster
with 3 cloud servers. Every server has a Solr instance and a ZooKeeper
instance. I start it with the -Dzkhost parameter. It works great; I send
updates with a curl (XML) call like this:

curl http://ip:SOLRport/solr/update -H "Content-Type: text/xml" --data-binary
'[XML add payload stripped by the archive; it contained the id 223232 and the text "test"]'

Solr has 2 million docs in the index. Now I want an extra field: content2. I
add this to my schema and upload it again to the cluster with
-Dbootstrap_confdir and -Dcollection.configName. It's replicated to the
whole cluster.

Now I need to re-index to add the field to every doc. I have a database with
all the data and want to use the full-import of DIH (this was the way I did
it in previous Solr versions). When I run this it goes at 3 docs/s (really
slow). When I run Solr alone (not SolrCloud) it goes at 600 docs/sec. 

What's the best way to do a full re-index with solrcloud? Does solrcloud
support DIH?

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Advice-solrCloud-DIH-tp4047339.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Poll: Largest SolrCloud out there?

2013-03-14 Thread Christian von Wendt-Jensen
Does it only count if you are using SolrCloud? We are using a traditional 
Master/Slave setup with Solr 4.1:

1 Master per 14 days:
Documents: ~15mio
Index size: ~150GB (stored fields)


#of masters: +30
Performance: SUCKS big time until the caches catch up. Unfortunately that takes 
quite some time.

Issues:
#1: Storage: To use SAN or not.
#2: Cores per instance: what is ideal?
#3: Size of cores: is 14 days optimal?
#4: Performance when searching across shards.
#5: Would SolrCloud be the solution for us?





Med venlig hilsen / Best Regards

Christian von Wendt-Jensen
IT Team Lead, Customer Solutions

Infopaq International A/S
Kgs. Nytorv 22
DK-1050 København K

Phone +45 36 99 00 00
Mobile +45 31 17 10 07
Email  christian.sonne.jen...@infopaq.com
Web    www.infopaq.com









From: Annette Newton <annette.new...@servicetick.com>
Reply-To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
Date: Wed, 13 Mar 2013 15:49:34 +0100
To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
Subject: Re: Poll: Largest SolrCloud out there?

8 AWS hosts.
35GB memory per host
10Gb allocated to JVM
13 aws compute units per instance
4 Shards, 2 replicas
25M docs in total
22.4GB index per shard
High writes, low reads




On 13 March 2013 09:12, adm1n 
mailto:evgeni.evg...@gmail.com>> wrote:

4 AWS hosts:
Memory: 30822868k total
CPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz x8
17M docs
5 Gb index.
8 master-slave shards (2 shards /host).
57 msec/query avg. time. (~110K queries/24 hours).





--
View this message in context:
http://lucene.472066.n3.nabble.com/Poll-Largest-SolrCloud-out-there-tp4043293p4046915.html
Sent from the Solr - User mailing list archive at Nabble.com.




--

Annette Newton

Database Administrator

ServiceTick Ltd



T:+44(0)1603 618326



Seebohm House, 2-4 Queen Street, Norwich, England NR2 4SQ

www.servicetick.com

*www.sessioncam.com*




Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-14 Thread Luis Cappa Banda
Hello!

Thanks a lot, Erick! I've attached some stack traces taken during a normal
'engine' run.

Cheers,

- Luis Cappa


2013/3/13 Erick Erickson 

> Stack traces..
>
> First,
> jps -l
>
> that will give you the process IDs of your running Java processes. Then:
>
> jstack <pid>
>
> Usually I pipe the output from jstack into a text file...
>
> Best
> Erick
>
>
> On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda wrote:
>
> > Uhm, how can I do that... 'cleanly'? I know that with JConsole it's
> > possible to output these traces, but with a .war application built on
> > top of Spring I don't know how I can do that. In any case, here is my
> > CloudSolrServer wrapper that is used by other classes. There is no sync
> > method or piece of code:
> >
> >  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> -
> > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >
> > public class BinaryLBHttpSolrServer extends LBHttpSolrServer {
> >
> > private static final long serialVersionUID = 3905956120804659445L;
> > public BinaryLBHttpSolrServer(String[] endpoints) throws
> > MalformedURLException {
> > super(endpoints);
> > }
> >
> > @Override
> > protected HttpSolrServer makeServer(String server) throws
> > MalformedURLException {
> > HttpSolrServer solrServer = super.makeServer(server);
> > solrServer.setRequestWriter(new BinaryRequestWriter());
> > return solrServer;
> > }
> > }
> >
> >  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> -
> > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >
> > public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {
> >  private CloudSolrServer cloudSolrServer;
> >
> > private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);
> >
> > public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[]
> > endpoints, int clientTimeout,
> > int connectTimeout, String cloudCollection) {
> >  try {
> > BinaryLBHttpSolrServer lbSolrServer = new BinaryLBHttpSolrServer(endpoints);
> > this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints,
> > lbSolrServer);
> > this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
> > this.cloudSolrServer.setZkClientTimeout(clientTimeout);
> > this.cloudSolrServer.setDefaultCollection(cloudCollection);
> >  } catch (MalformedURLException e) {
> > log.error(e);
> > }
> > }
> >
> > @Override
> > public QueryResponse search(SolrQuery query) throws SolrServerException {
> > return cloudSolrServer.query(query, METHOD.POST);
> > }
> >
> > @Override
> > public boolean index(DocumentBean user) {
> > boolean indexed = false;
> > int retries = 0;
> >  do {
> > indexed = addBean(user);
> > retries++;
> >  } while(!indexed && retries<4);
> >  return indexed;
> > }
> >  @Override
> public boolean update(SolrInputDocument updateDoc) {
> > boolean update = false;
> > int retries = 0;
> >
> > do {
> > update = addSolrInputDocument(updateDoc);
> > retries++;
> >  } while(!update && retries<4);
> >  return update;
> > }
> >  @Override
> > public void commit() {
> > try {
> > cloudSolrServer.commit();
> > } catch (SolrServerException e) {
> >  log.error(e);
> > } catch (IOException e) {
> >  log.error(e);
> > }
> > }
> >
> > @Override
> > public boolean delete(String... ids) {
> > boolean deleted = false;
> >  List<String> idList = Arrays.asList(ids);
> >  try {
> > this.cloudSolrServer.deleteById(idList);
> > this.cloudSolrServer.commit(true, true);
> > deleted = true;
> >
> > } catch (SolrServerException e) {
> > log.error(e);
> >
> > } catch (IOException e) {
> > log.error(e);
> >  }
> >  return deleted;
> > }
> >
> > @Override
> > public void optimize() {
> > try {
> > this.cloudSolrServer.optimize();
> >  } catch (SolrServerException e) {
> > log.error(e);
> >  } catch (IOException e) {
> > log.error(e);
> > }
> > }
> >  /*
> >   *  Getters & setters
> >   */
> >  public CloudSolrServer getSolrServer() {
> > return cloudSolrServer;
> > }
> >
> > public void setSolrServer(CloudSolrServer solrServer) {
> > this.cloudSolrServer = solrServer;
> > }
> >
> > private boolean addBean(DocumentBean user) {
> > boolean added = false;
> >  try {
> > this.cloudSolrServer.addBean(user, 100);
> > this.commit();
> > added = true; // mark success so the retry loop in index() terminates
> >
> > } catch (IOException e) {
> > log.error(e);
> >
> > } catch (SolrServerException e) {
> > log.error(e);
> >  } catch (SolrException e) {
> > log.error(e);
> > }
> >  return added;
> > }
> >  private boolean addSolrInputDocument(SolrInputDocument updateDoc) {
> > boolean added = false;
> >  try {
> > this.cloudSolrServer.add(updateDoc, 100);
> > this.commit();
> > added = true;
> >  } catch (IOException e) {
> > log.error(e);
> >
> > } catch (SolrServerException e) {
> > log.error(e);
> >  }catch(SolrException e) {
> > log.error(e);
> > }
> >  return added;
> > }
> > }
> >
> > Thank you very much, Mark.
> >
> >
> > -  L

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Rafał Radecki
In the output of:

/solr/replication?command=details

there is indexVersion mentioned many times:



0
3


22.59 KB
/usr/share/solr/data/index/


1363259880360
4

_1.tvx
_1_nrm.cfs
_1_Lucene41_0.doc
_1_Lucene41_0.tim
_1_Lucene41_0.tip
_1.fnm
_1_nrm.cfe
_1.fdx
_1_Lucene41_0.pos
_1.tvf
_1.fdt
_1_Lucene41_0.pay
_1.si
_1.tvd
segments_4



false
true
1363259808632
3


22.59 KB
/usr/share/solr/data/index/


1363263304585
4

_2_Lucene41_0.pos
_2.si
_2_Lucene41_0.tim
_2.fdt
_2_Lucene41_0.doc
_2_Lucene41_0.tip
_2.fdx
_2.tvx
_2.fnm
_2_nrm.cfe
_2.tvd
_2_Lucene41_0.pay
_2_nrm.cfs
_2.tvf
segments_4



true
false
1363263304585
4

schema.xml,stopwords.txt

commit
startup

false
4


http://172.18.19.204:8080/solr
00:00:60
Polling disabled
Thu Mar 14 12:18:00 CET 2013

Thu Mar 14 12:18:00 CET 2013
Thu Mar 14 12:17:00 CET 2013
Fri Mar 08 14:55:00 CET 2013
Fri Mar 08 14:50:52 CET 2013
Fri Mar 08 14:32:00 CET 2013

5
23214
0
Thu Mar 14 13:15:53 CET 2013
true
false



This response format is experimental. It is likely to change in the future.



Which one should be used? Is there any other way to monitor index
version on master and slave?

Best regards,
Rafał Radecki.
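
One scriptable way to keep an eye on this, as a sketch (the host names are 
placeholders, and it naively greps the raw details output rather than parsing 
the response properly):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class ReplicationVersionCheck {
    public static void main(String[] args) throws Exception {
        String[] hosts = { "http://master:8080", "http://slave:8080" }; // placeholders
        for (String host : hosts) {
            URL url = new URL(host + "/solr/replication?command=details");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"));
            String line;
            while ((line = in.readLine()) != null) {
                // Crude: print every line mentioning a version or generation.
                if (line.contains("indexVersion") || line.contains("generation")) {
                    System.out.println(host + " -> " + line.trim());
                }
            }
            in.close();
        }
    }
}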

2013/3/14 Rafał Radecki :
> Hi All.
>
> I am monitoring two solr 4.1 solr instances in master-slave setup. On
> both nodes I check url /solr/replication?command=details and parse it
> to get:
> - on master: if replication is enabled -> field replicationEnabled
> - on slave: if replication is enabled -> field replicationEnabled
> - on slave: if polling is disabled -> field isPollingDisabled
> For Solr 3.6 I've also used the URL:
> solr/replication?command=indexversion
> but for 4.1 it gives me different results on master and slave, on
> slave the version is higher despite the fact that replication is
> enabled, polling is enabled and in admin gui
> /solr/#/collection1/replication I have:
>
>          Index Version    Gen   Size
> Master:  1363259808632    3     22.59 KB
> Slave:   1363259808632    3     22.59 KB
> So as I see it master and slave have the same version of index despite
> the fact that /solr/replication?command=indexversion gives:
> - on master: 1363259808632
> - on slave: 1363259880360 -> higher value
> Is this a bug?
>
> Best regards,
> Rafal Radecki.


Re: New-Question On Search data who does not have "x" field

2013-03-14 Thread Jack Krupansky
Writing "OR -" is simply the same as "-", so the query would match documents 
containing category 20 and then remove all documents that had any category 
(including 20) specified, giving you nothing.


Try:

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:"20" OR 
(*:* -category:[* TO *])


Technically, the following should work, but there have been bugs with pure 
negative queries and sub-queries, so it may or may not work:


http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:"20" OR 
(-category:[* TO *])
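
The same filter in SolrJ form would look roughly like this (untested sketch; 
the core URL is a placeholder):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class MissingOrMatchingCategory {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        SolrQuery q = new SolrQuery("*:*");
        // Docs whose category is 20, OR docs with no category at all.
        // The "*:*" inside the parentheses anchors the pure-negative
        // clause so it works even where bare negative sub-queries fail.
        q.addFilterQuery("category:\"20\" OR (*:* -category:[* TO *])");
        System.out.println(server.query(q).getResults().getNumFound());
        server.shutdown();
    }
}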


-- Jack Krupansky

-Original Message- 
From: anurag.jain

Sent: Thursday, March 14, 2013 3:48 AM
To: solr-user@lucene.apache.org
Subject: New-Question On Search data who does not have "x" field

My previous question was:

I have uploaded 250 documents to Solr,

and some of the docs have a "category" field and some don't.

for example.

{
"id":"321",
"name":"anurag",
"category":"30"
},
{
"id":"3",
"name":"john"
}

Now I want to search for the docs that do not have that field.
What should the query look like?

I got an answer.

I can use http://localhost:8983/search?q=*:*&fq=-category:[* TO *]


But now I am facing a problem: I want to search for all docs that either do
not have the category field, or have category field value = 20.

I wrote the following query:

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:"20" OR
-category:[* TO *]

But it is giving me zero output.

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:"20"  ->
output = 2689

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=-category:[* TO *]  ->
output = 2644684




What is the problem... am I making a mistake somewhere?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/New-Question-On-Search-data-who-does-not-have-x-field-tp4047270.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Rafał Radecki
Hi All.

I am monitoring two Solr 4.1 instances in a master-slave setup. On
both nodes I check the URL /solr/replication?command=details and parse it
to get:
- on master: if replication is enabled -> field replicationEnabled
- on slave: if replication is enabled -> field replicationEnabled
- on slave: if polling is disabled -> field isPollingDisabled
For Solr 3.6 I've also used the URL:
solr/replication?command=indexversion
but for 4.1 it gives me different results on master and slave, on
slave the version is higher despite the fact that replication is
enabled, polling is enabled and in admin gui
/solr/#/collection1/replication I have:

         Index Version    Gen   Size
Master:  1363259808632    3     22.59 KB
Slave:   1363259808632    3     22.59 KB
So as I see it master and slave have the same version of index despite
the fact that /solr/replication?command=indexversion gives:
- on master: 1363259808632
- on slave: 1363259880360 -> higher value
Is this a bug?

Best regards,
Rafal Radecki.


Re: OutOfMemoryError

2013-03-14 Thread Arkadi Colson
When I shut down Tomcat, free -m and top keep telling me the same values. 
Almost no free memory...


Any idea?

On 03/14/2013 10:35 AM, Arkadi Colson wrote:

> Hi
>
> [original OutOfMemoryError report trimmed; it is reproduced in full in
> the original "OutOfMemoryError" message below]
>
> Java HotSpot(TM) 64-Bit Server VM warning: Attempt to allocate stack
> guard pages failed.
> mmap failed for CEN and END part of zip file



--
Met vriendelijke groeten

Arkadi Colson

Smartbit bvba . Hoogstraat 13 . 3670 Meeuwen
T +32 11 6

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Paul Libbrecht
Chantal,

the goal is different: to get a general feeling for how practical it is to
integrate this into the routine.
If you are able, on your contemporary machine (which I assume is not a
supercomputer of some special sort), to run this whole process, once it is
useful for you, in about 2 minutes, then I'll be very interested.

If, as with quite a few things where Maven starts up and integration is
measured from all facets, it takes more than 15 minutes to run the process
once it is useful, then I will be less motivated.

I'm not asking for performance measurements, and certainly not of Solr itself,
which I trust largely and which depends a lot on good caching. Yes, for that,
JMeter or other tools are useful.

Paul


On 14 March 2013, at 12:20, Chantal Ackermann wrote:

> [Chantal's answer, and the earlier messages in this thread, are quoted in
> full in her message below]



Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Chantal Ackermann
Hi Paul,

I'm sorry I cannot provide you with any numbers. I also doubt it would be wise 
to post any as I think the speed depends highly on what you are doing in your 
integration tests.

Say you have several request handlers that you want to test (on different 
cores), and some more complex use cases like using output from one request 
handler as input to others. You would also import test data that would be 
representative enough to test these request handlers and use cases.

The requests themselves, of course, only take as long as SolrJ takes to run and 
SOLR takes to answer them.
In addition, there is the overhead of Maven starting up, running all the 
plugins, importing the data, executing the tests. Well, Maven is certainly not 
the fastest tool to start up and get going…

If you are asking because you want to run rather a lot of requests and test their 
output - JMeter might be preferable?

Hope that was not too vague an answer,
Chantal


Am 14.03.2013 um 09:51 schrieb Paul Libbrecht:

> Nice,
> 
> Chantal can you indicate there or here what kind of speed for integration 
> tests you've reached with this, from a bare source to a successfully tested 
> application?
> (e.g. with 100 documents)
> 
> thanks in advance
> 
> Paul
> 
> 
> On 14 March 2013, at 09:29, Chantal Ackermann wrote:
> 
>> Hi all,
>> 
>> 
>> this is not a question. I just wanted to announce that I've written a blog 
>> post on how to set up Maven for packaging and automatic testing of a SOLR 
>> index configuration.
>> 
>> http://blog.it-agenten.com/2013/03/integration-testing-your-solr-index-with-maven/
>> 
>> Feedback or comments appreciated!
>> And again, thanks for that great piece of software.
>> 
>> Chantal
>> 
> 



OutOfMemoryError

2013-03-14 Thread Arkadi Colson

Hi

I'm getting this error after a few hours of filling solr with documents. 
Tomcat is running with -Xms1024m -Xmx4096m.
Total memory of host is 12GB. Softcommits are done every second and hard 
commits every minute.

Any idea why this is happening and how to avoid this?


*top*
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
13666 root  20   0 86.8g 4.7g 248m S  101 39.7 478:37.45 
/usr/bin/java 
-Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties -server 
-Xms1024m -Xmx4096m -XX:PermSize=64m -XX:MaxPermSize=128m 
-Duser.timezone=UTC -Dfile.encoding=UTF8 -Dsolr.solr.home=/opt/solr/ 
-Dport=8983 -Dcollection.configName
22247 root  20   0 2430m 409m 4176 S    0  3.4   1:23.43 java 
-Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp 
/opt/zookeeper/bin/../build/classes:/opt/zookeeper/bin/../build/lib/*.jar:/opt/zookeeper/bi



*free -m*
             total       used       free     shared    buffers     cached
Mem:         12047      11942        105          0        180       6363
-/+ buffers/cache:        5399       6648
Swap:          956         75        881


*log*
SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError
at 
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:462)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:290)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at 
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:931)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.OutOfMemoryError
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:127)
at java.util.zip.ZipFile.<init>(ZipFile.java:144)
at 
org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipFile(ZipHelper.java:157)
at 
org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:101)
at 
org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:207)
at 
org.apache.tika.parser.pkg.ZipContainerDetector.detectOfficeOpenXML(ZipContainerDetector.java:194)
at 
org.apache.tika.parser.pkg.ZipContainerDetector.detectZipFormat(ZipContainerDetector.java:134)
at 
org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:77)
at 
org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:113)
at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)

... 15 more

Java HotSpot(TM) 64-Bit Server VM warning: Attempt to allocate stack 
guard pages failed.

mmap failed for CEN and END part of zip file



--
Met vriendelijke groeten

Arkadi Colson

Smartbit bvba . Hoogstraat 13 . 3670 Meeuwen
T +32 11 64 08 80 . F +32 11 64 08 81
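
Worth noting: the OutOfMemoryError is thrown from ZipFile.open (a native
method), and the trailing "mmap failed for CEN and END part of zip file"
points at native memory / address space rather than Java heap, so raising
-Xmx alone may not help here. One way to take the extraction pressure off
the Solr JVM entirely is to run Tika on the client and send plain fields,
instead of posting raw files to the ExtractingRequestHandler. A minimal
sketch, assuming Tika and SolrJ are on the client classpath (the URL, file
path and field names are hypothetical):

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;

public class ClientSideExtraction {

    public static void main(String[] args) throws Exception {
        // hypothetical core URL and input file
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/core1");
        File file = new File("/path/to/document.docx");

        // run Tika in the client JVM, so the extraction memory cost
        // is paid here instead of inside the Tomcat/Solr process
        BodyContentHandler handler = new BodyContentHandler(-1); // -1 = no write limit
        Metadata metadata = new Metadata();
        try (InputStream in = new FileInputStream(file)) {
            new AutoDetectParser().parse(in, handler, metadata, new ParseContext());
        }

        // index only the extracted text, not the raw binary
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", file.getName());           // hypothetical unique key
        doc.addField("content", handler.toString());  // hypothetical text field
        server.add(doc);
        server.commit();
    }
}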



Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Paul Libbrecht
Nice,

Chantal can you indicate there or here what kind of speed for integration tests 
you've reached with this, from a bare source to a successfully tested 
application?
(e.g. with 100 documents)

thanks in advance

Paul


On 14 March 2013, at 09:29, Chantal Ackermann wrote:

> Hi all,
> 
> 
> this is not a question. I just wanted to announce that I've written a blog 
> post on how to set up Maven for packaging and automatic testing of a SOLR 
> index configuration.
> 
> http://blog.it-agenten.com/2013/03/integration-testing-your-solr-index-with-maven/
> 
> Feedback or comments appreciated!
> And again, thanks for that great piece of software.
> 
> Chantal
> 



Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread David Philip
Informative. Useful. Thanks.


On Thu, Mar 14, 2013 at 1:59 PM, Chantal Ackermann <
c.ackerm...@it-agenten.com> wrote:

> Hi all,
>
>
> this is not a question. I just wanted to announce that I've written a blog
> post on how to set up Maven for packaging and automatic testing of a SOLR
> index configuration.
>
>
> http://blog.it-agenten.com/2013/03/integration-testing-your-solr-index-with-maven/
>
> Feedback or comments appreciated!
> And again, thanks for that great piece of software.
>
> Chantal
>
>


Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Chantal Ackermann
Hi all,


this is not a question. I just wanted to announce that I've written a blog post 
on how to set up Maven for packaging and automatic testing of a SOLR index 
configuration.

http://blog.it-agenten.com/2013/03/integration-testing-your-solr-index-with-maven/

Feedback or comments appreciated!
And again, thanks for that great piece of software.

Chantal



Re: Solr Replication

2013-03-14 Thread Ahmet Arslan
Hi Vicky,

Maybe <str name="replicateAfter">startup</str> ?

For backups http://master_host:port/solr/replication?command=backup would be 
more suitable.
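
For the backup command, here is a minimal SolrJ sketch that routes a request
to the ReplicationHandler (the master URL and core name are hypothetical):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class TriggerBackup {

    public static void main(String[] args) throws Exception {
        // hypothetical master URL
        HttpSolrServer master = new HttpSolrServer("http://master_host:8983/solr/core1");

        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("command", "backup"); // same as .../replication?command=backup

        QueryRequest request = new QueryRequest(params);
        request.setPath("/replication"); // route to the ReplicationHandler
        master.request(request);
    }
}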


--- On Thu, 3/14/13, vicky desai  wrote:

> From: vicky desai 
> Subject: Solr Replication
> To: solr-user@lucene.apache.org
> Date: Thursday, March 14, 2013, 9:20 AM
> Hi,
> 
> I am using solr 4 setup. For the backup purpose once in a
> day I start one
> additional tomcat server with cores having empty data
> folders and which acts
> as a slave server. However it does not replicate data from
> the master unless
> there is a commit on the master. Is there a possibility to
> pull data from
> master core without firing a commit operation on that core?
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-Replication-tp4047266.html
> Sent from the Solr - User mailing list archive at
> Nabble.com.
> 


New-Question On Search data who does not have "x" field

2013-03-14 Thread anurag.jain
My previous question was:

I have added 250 documents to Solr.

Some of the documents have a "category" field and some don't.

for example. 

{ 
"id":"321", 
"name":"anurag", 
"category":"30" 
}, 
{ 
"id":"3", 
"name":"john" 
} 

Now I want to search for the docs that do not have that field.
What should the query look like?
I got an answer.

I can use http://localhost:8983/search?q=*:*&fq=-category:[* TO *]


But now I am facing a problem: I want to search for all docs that either do
not have the category field or have category field value = 20.

I wrote the following query:

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:"20" OR
-category:[* TO *]

But it is giving me zero results.

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:"20"  ->
output = 2689

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=-category:[* TO *]  ->
output = 2644684




What is the problem? Am I making a mistake somewhere?
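
A likely explanation: in Lucene/Solr query syntax, a purely negative clause
matches nothing on its own, so it cannot contribute documents to an OR; the
filter above effectively keeps only docs that have category 20 AND no
category field, which is impossible. The usual workaround is to anchor the
negative clause with *:*, along the lines of (assuming /search accepts the
same parameters as the queries above):

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:"20" OR (*:* -category:[* TO *])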



--
View this message in context: 
http://lucene.472066.n3.nabble.com/New-Question-On-Search-data-who-does-not-have-x-field-tp4047270.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Replication

2013-03-14 Thread vicky desai
Hi,

I am using a solr 4 setup. For backup purposes, once a day I start one
additional tomcat server with cores having empty data folders, which acts
as a slave server. However, it does not replicate data from the master unless
there is a commit on the master. Is there a possibility to pull data from the
master core without firing a commit operation on that core?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Replication-tp4047266.html
Sent from the Solr - User mailing list archive at Nabble.com.