Solr Replication

2013-03-14 Thread vicky desai
Hi,

I am using a Solr 4 setup. For backup purposes, once a day I start an
additional Tomcat server with cores that have empty data folders, which acts
as a slave server. However, it does not replicate data from the master unless
there is a commit on the master. Is there a way to pull data from the
master core without firing a commit operation on that core?





New-Question On Search data who does not have x field

2013-03-14 Thread anurag.jain
My previous question was:

I have added 250 records to Solr.

Some of the records have a category field and some don't.

For example:

{
  id: 321,
  name: anurag,
  category: 30
},
{
  id: 3,
  name: john
}

Now I want to search for the docs that do not have that field.
What should the query look like?
 
I got an answer:

I can use http://localhost:8983/search?q=*:*&fq=-category:[* TO *]


But now I am facing a problem: I want to search for all docs that do not
have the category field, or whose category field value = 20.

I wrote the following query:

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:20 OR
-category:[* TO *]

but it is giving me zero output.

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:20
- output = 2689

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=-category:[* TO *]
- output = 2644684




What is the problem? Am I making a mistake somewhere?





Re: Solr Replication

2013-03-14 Thread Ahmet Arslan
Hi Vicky,

Maybe <str name="replicateAfter">startup</str>?

For backups, http://master_host:port/solr/replication?command=backup would be
more suitable,

or <str name="backupAfter">startup</str>.
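
In solrconfig.xml on the master, that would look roughly like this (a sketch
of the relevant part of the /replication handler, not a complete config;
adjust confFiles to whatever you actually distribute):

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- publish a new index version on commit, and also right after startup -->
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
    <!-- or take a snapshot automatically after startup -->
    <str name="backupAfter">startup</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>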


--- On Thu, 3/14/13, vicky desai vicky.de...@germinait.com wrote:

 [original message quoted in full, snipped]
 


Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Chantal Ackermann
Hi all,


this is not a question. I just wanted to announce that I've written a blog post 
on how to set up Maven for packaging and automatic testing of a SOLR index 
configuration.

http://blog.it-agenten.com/2013/03/integration-testing-your-solr-index-with-maven/

Feedback or comments appreciated!
And again, thanks for that great piece of software.

Chantal



Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread David Philip
Informative. Useful. Thanks!


On Thu, Mar 14, 2013 at 1:59 PM, Chantal Ackermann 
c.ackerm...@it-agenten.com wrote:

 [original announcement quoted in full, snipped]




Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Paul Libbrecht
Nice,

Chantal, can you indicate there or here what kind of speed you've reached
with this for integration tests, from a bare source tree to a successfully
tested application?
(e.g. with 100 documents)

thanks in advance

Paul


On 14 March 2013, at 09:29, Chantal Ackermann wrote:

 [original announcement quoted in full, snipped]
 



OutOfMemoryError

2013-03-14 Thread Arkadi Colson

Hi

I'm getting this error after a few hours of filling Solr with documents.
Tomcat is running with -Xms1024m -Xmx4096m.
Total memory of the host is 12GB. Soft commits are done every second and hard
commits every minute.

Any idea why this is happening and how to avoid this?


top:

  PID USER   PR  NI  VIRT   RES   SHR  S %CPU %MEM   TIME+     COMMAND
13666 root   20   0  86.8g  4.7g  248m S  101 39.7  478:37.45  /usr/bin/java
      -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties -server
      -Xms1024m -Xmx4096m -XX:PermSize=64m -XX:MaxPermSize=128m
      -Duser.timezone=UTC -Dfile.encoding=UTF8 -Dsolr.solr.home=/opt/solr/
      -Dport=8983 -Dcollection.configName
22247 root   20   0  2430m  409m  4176 S    0  3.4    1:23.43  java
      -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp
      /opt/zookeeper/bin/../build/classes:/opt/zookeeper/bin/../build/lib/*.jar:/opt/zookeeper/bi

free -m:

             total   used   free  shared  buffers  cached
Mem:         12047  11942    105       0      180    6363
-/+ buffers/cache:   5399   6648
Swap:          956     75    881


log:

SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError
        at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:462)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:931)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.OutOfMemoryError
        at java.util.zip.ZipFile.open(Native Method)
        at java.util.zip.ZipFile.<init>(ZipFile.java:127)
        at java.util.zip.ZipFile.<init>(ZipFile.java:144)
        at org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipFile(ZipHelper.java:157)
        at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:101)
        at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:207)
        at org.apache.tika.parser.pkg.ZipContainerDetector.detectOfficeOpenXML(ZipContainerDetector.java:194)
        at org.apache.tika.parser.pkg.ZipContainerDetector.detectZipFormat(ZipContainerDetector.java:134)
        at org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:77)
        at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:113)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
        ... 15 more

Java HotSpot(TM) 64-Bit Server VM warning: Attempt to allocate stack
guard pages failed.

mmap failed for CEN and END part of zip file



--
Kind regards

Arkadi Colson

Smartbit bvba . Hoogstraat 13 . 3670 Meeuwen
T +32 11 64 08 80 . F +32 11 64 08 81



Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Chantal Ackermann
Hi Paul,

I'm sorry I cannot provide you with any numbers. I also doubt it would be wise 
to post any as I think the speed depends highly on what you are doing in your 
integration tests.

Say you have several request handlers that you want to test (on different 
cores), and some more complex use cases like using output from one request 
handler as input to others. You would also import test data that would be 
representative enough to test these request handlers and use cases.

The requests themselves, of course, only take as long as SolrJ takes to run and 
SOLR takes to answer them.
In addition, there is the overhead of Maven starting up, running all the 
plugins, importing the data, executing the tests. Well, Maven is certainly not 
the fastest tool to start up and get going…

If you are asking because you want to run rather a lot of requests and test
their output - JMeter might be preferable?

Hope that was not too vague an answer,
Chantal


On 14.03.2013 at 09:51, Paul Libbrecht wrote:

 [Paul's message quoted in full, snipped]
 



Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Paul Libbrecht
Chantal,

the goal is different: to get a general feeling for how practical it is to
integrate this into the routine.
If you are able, on your contemporary machine (which I assume is not a
supercomputer of some special sort), to run this whole process to a useful
result in about 2 minutes, then I'll be very interested.

If, like quite a few Maven-driven setups where integration is measured from
all facets, it takes more than 15 minutes to run this process once it is
useful, then I will be less motivated.

I'm not asking for a performance measurement, and certainly not of Solr,
which I trust largely and which depends a lot on good caching. Yes, for
that, JMeter or others are useful.

Paul


On 14 March 2013, at 12:20, Chantal Ackermann wrote:

 [Chantal's reply and the earlier messages quoted in full, snipped]
 



Re: OutOfMemoryError

2013-03-14 Thread Arkadi Colson
When I shut down Tomcat, free -m and top keep showing the same values.
Almost no free memory...


Any idea?

On 03/14/2013 10:35 AM, Arkadi Colson wrote:

Hi

I'm getting this error after a few hours of filling Solr with documents.
Tomcat is running with -Xms1024m -Xmx4096m. [...]

[rest of the quoted message, including the top/free output and stack trace,
snipped; see the original message above]

Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Rafał Radecki
Hi All.

I am monitoring two Solr 4.1 instances in a master-slave setup. On
both nodes I check the URL /solr/replication?command=details and parse it
to get:
- on master: whether replication is enabled - field replicationEnabled
- on slave: whether replication is enabled - field replicationEnabled
- on slave: whether polling is disabled - field isPollingDisabled
For Solr 3.6 I also used the URL:
/solr/replication?command=indexversion
but for 4.1 it gives me different results on master and slave; on the
slave the version is higher despite the fact that replication is
enabled, polling is enabled, and in the admin GUI
/solr/#/collection1/replication I have:

          Index Version   Gen   Size
Master:   1363259808632   3     22.59 KB
Slave:    1363259808632   3     22.59 KB

So as I see it, master and slave have the same index version, despite
the fact that /solr/replication?command=indexversion gives:
- on master: <long name="indexversion">1363259808632</long>
- on slave: <long name="indexversion">1363259880360</long> - a higher value
Is this a bug?

Best regards,
Rafal Radecki.


Re: New-Question On Search data who does not have x field

2013-03-14 Thread Jack Krupansky
Writing "OR -" is simply the same as "-", so the query would match documents
containing category 20 and then remove all documents that had any category
(including 20) specified, giving you nothing.


Try:

http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:20 OR
(*:* -category:[* TO *])


Technically, the following should work, but there have been bugs with pure
negative queries and sub-queries, so it may or may not work:


http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:20 OR
(-category:[* TO *])
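
A quick way to sanity-check the counts (a sketch; it assumes the stock
/solr/select handler rather than the /search path above, and uses curl's
--data-urlencode so the spaces and brackets in the fq are escaped properly):

curl 'http://localhost:8983/solr/select' -G \
  --data-urlencode 'q=*:*' --data-urlencode 'rows=0' \
  --data-urlencode 'fq=category:20 OR (*:* -category:[* TO *])'

The numFound of this combined filter should equal the sum of the two separate
counts you measured (2689 + 2644684), since the two clauses are disjoint.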


-- Jack Krupansky

----- Original Message -----
From: anurag.jain
Sent: Thursday, March 14, 2013 3:48 AM
To: solr-user@lucene.apache.org
Subject: New-Question On Search data who does not have x field

[original message quoted in full, snipped]



Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Rafał Radecki
In the output of:

/solr/replication?command=details

indexVersion is mentioned many times:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">3</int>
  </lst>
  <lst name="details">
    <str name="indexSize">22.59 KB</str>
    <str name="indexPath">/usr/share/solr/data/index/</str>
    <arr name="commits">
      <lst>
        <long name="indexVersion">1363259880360</long>
        <long name="generation">4</long>
        <arr name="filelist">
          <str>_1.tvx</str>
          <str>_1_nrm.cfs</str>
          <str>_1_Lucene41_0.doc</str>
          <str>_1_Lucene41_0.tim</str>
          <str>_1_Lucene41_0.tip</str>
          <str>_1.fnm</str>
          <str>_1_nrm.cfe</str>
          <str>_1.fdx</str>
          <str>_1_Lucene41_0.pos</str>
          <str>_1.tvf</str>
          <str>_1.fdt</str>
          <str>_1_Lucene41_0.pay</str>
          <str>_1.si</str>
          <str>_1.tvd</str>
          <str>segments_4</str>
        </arr>
      </lst>
    </arr>
    <str name="isMaster">false</str>
    <str name="isSlave">true</str>
    <long name="indexVersion">1363259808632</long>
    <long name="generation">3</long>
    <lst name="slave">
      <lst name="masterDetails">
        <str name="indexSize">22.59 KB</str>
        <str name="indexPath">/usr/share/solr/data/index/</str>
        <arr name="commits">
          <lst>
            <long name="indexVersion">1363263304585</long>
            <long name="generation">4</long>
            <arr name="filelist">
              <str>_2_Lucene41_0.pos</str>
              <str>_2.si</str>
              <str>_2_Lucene41_0.tim</str>
              <str>_2.fdt</str>
              <str>_2_Lucene41_0.doc</str>
              <str>_2_Lucene41_0.tip</str>
              <str>_2.fdx</str>
              <str>_2.tvx</str>
              <str>_2.fnm</str>
              <str>_2_nrm.cfe</str>
              <str>_2.tvd</str>
              <str>_2_Lucene41_0.pay</str>
              <str>_2_nrm.cfs</str>
              <str>_2.tvf</str>
              <str>segments_4</str>
            </arr>
          </lst>
        </arr>
        <str name="isMaster">true</str>
        <str name="isSlave">false</str>
        <long name="indexVersion">1363263304585</long>
        <long name="generation">4</long>
        <lst name="master">
          <str name="confFiles">schema.xml,stopwords.txt</str>
          <arr name="replicateAfter">
            <str>commit</str>
            <str>startup</str>
          </arr>
          <str name="replicationEnabled">false</str>
          <long name="replicatableGeneration">4</long>
        </lst>
      </lst>
      <str name="masterUrl">http://172.18.19.204:8080/solr</str>
      <str name="pollInterval">00:00:60</str>
      <str name="nextExecutionAt">Polling disabled</str>
      <str name="indexReplicatedAt">Thu Mar 14 12:18:00 CET 2013</str>
      <arr name="indexReplicatedAtList">
        <str>Thu Mar 14 12:18:00 CET 2013</str>
        <str>Thu Mar 14 12:17:00 CET 2013</str>
        <str>Fri Mar 08 14:55:00 CET 2013</str>
        <str>Fri Mar 08 14:50:52 CET 2013</str>
        <str>Fri Mar 08 14:32:00 CET 2013</str>
      </arr>
      <str name="timesIndexReplicated">5</str>
      <str name="lastCycleBytesDownloaded">23214</str>
      <str name="previousCycleTimeInSeconds">0</str>
      <str name="currentDate">Thu Mar 14 13:15:53 CET 2013</str>
      <str name="isPollingDisabled">true</str>
      <str name="isReplicating">false</str>
    </lst>
  </lst>
  <str name="WARNING">
    This response format is experimental. It is likely to change in the future.
  </str>
</response>

Which one should be used? Is there any other way to monitor the index
version on master and slave?
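
One thing that makes the parsing easier (a sketch; the host names are
placeholders) is to request the same details as JSON and read only the
top-level indexVersion/generation on each node:

curl 'http://master_host:8080/solr/replication?command=details&wt=json'
curl 'http://slave_host:8080/solr/replication?command=details&wt=json'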

Best regards,
Rafał Radecki.

2013/3/14 Rafał Radecki radecki.ra...@gmail.com:
 [original message quoted in full, snipped]


Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-14 Thread Luis Cappa Banda
Hello!

Thanks a lot, Erick! I've attached some stack traces taken during a normal
'engine' run.

Cheers,

- Luis Cappa


2013/3/13 Erick Erickson erickerick...@gmail.com

 Stack traces..

 First,
 jps -l

 that will give you the process IDs of your running Java processes. Then:

 jstack <pid from above>

 Usually I pipe the output from jstack into a text file...

 Best
 Erick


 On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda luisca...@gmail.com
 wrote:

 Uhm, how can I do that... 'cleanly'? I know that with JConsole it's
 possible to output these traces, but with a .war application built on top of
 Spring I don't know how I can do that. In any case, here is my CloudSolrServer
 wrapper that is used by other classes. There is no synchronized method or
 piece of code:

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 public class BinaryLBHttpSolrServer extends LBHttpSolrServer {

     private static final long serialVersionUID = 3905956120804659445L;

     public BinaryLBHttpSolrServer(String[] endpoints) throws MalformedURLException {
         super(endpoints);
     }

     @Override
     protected HttpSolrServer makeServer(String server) throws MalformedURLException {
         HttpSolrServer solrServer = super.makeServer(server);
         solrServer.setRequestWriter(new BinaryRequestWriter());
         return solrServer;
     }
 }
 
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 
 public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {

     private CloudSolrServer cloudSolrServer;

     private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);

     public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[] endpoints,
             int clientTimeout, int connectTimeout, String cloudCollection) {
         try {
             BinaryLBHttpSolrServer lbSolrServer = new BinaryLBHttpSolrServer(endpoints);
             this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints, lbSolrServer);
             this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
             this.cloudSolrServer.setZkClientTimeout(clientTimeout);
             this.cloudSolrServer.setDefaultCollection(cloudCollection);
         } catch (MalformedURLException e) {
             log.error(e);
         }
     }

     @Override
     public QueryResponse search(SolrQuery query) throws SolrServerException {
         return cloudSolrServer.query(query, METHOD.POST);
     }

     @Override
     public boolean index(DocumentBean user) {
         boolean indexed = false;
         int retries = 0;
         do {
             indexed = addBean(user);
             retries++;
         } while (!indexed && retries < 4);
         return indexed;
     }

     @Override
     public boolean update(SolrInputDocument updateDoc) {
         boolean update = false;
         int retries = 0;
         do {
             update = addSolrInputDocument(updateDoc);
             retries++;
         } while (!update && retries < 4);
         return update;
     }

     @Override
     public void commit() {
         try {
             cloudSolrServer.commit();
         } catch (SolrServerException e) {
             log.error(e);
         } catch (IOException e) {
             log.error(e);
         }
     }

     @Override
     public boolean delete(String... ids) {
         boolean deleted = false;
         List<String> idList = Arrays.asList(ids);
         try {
             this.cloudSolrServer.deleteById(idList);
             this.cloudSolrServer.commit(true, true);
             deleted = true;
         } catch (SolrServerException e) {
             log.error(e);
         } catch (IOException e) {
             log.error(e);
         }
         return deleted;
     }

     @Override
     public void optimize() {
         try {
             this.cloudSolrServer.optimize();
         } catch (SolrServerException e) {
             log.error(e);
         } catch (IOException e) {
             log.error(e);
         }
     }

     /* Getters & setters */

     public CloudSolrServer getSolrServer() {
         return cloudSolrServer;
     }

     public void setSolrServer(CloudSolrServer solrServer) {
         this.cloudSolrServer = solrServer;
     }

     private boolean addBean(DocumentBean user) {
         boolean added = false;
         try {
             this.cloudSolrServer.addBean(user, 100);
             this.commit();
             added = true;
         } catch (IOException e) {
             log.error(e);
         } catch (SolrServerException e) {
             log.error(e);
         } catch (SolrException e) {
             log.error(e);
         }
         return added;
     }

     private boolean addSolrInputDocument(SolrInputDocument updateDoc) {
         boolean added = false;
         try {
             this.cloudSolrServer.add(updateDoc, 100);
             this.commit();
             added = true;
         } catch (IOException e) {
             log.error(e);
         } catch (SolrServerException e) {
             log.error(e);
         } catch (SolrException e) {
             log.error(e);
         }
         return added;
     }
 }
 
  Thank you very much, Mark.
 
 
  -  Luis Cappa
 
 
 
  2013/3/13 Mark Miller markrmil...@gmail.com
 
  
   Could you capture some thread stack traces in the 'engine' and see if
   there are any blocking methods?
  
   - Mark
  
   On Mar 13, 2013, at 1:34 PM, Luis Cappa Banda luisca...@gmail.com
  wrote:
  
Just one 

Re: Poll: Largest SolrCloud out there?

2013-03-14 Thread Christian von Wendt-Jensen
Does it only count if you are using SolrCloud? We are using a traditional
Master/Slave setup with Solr 4.1:

1 master per 14 days:
Documents: ~15 million
Index size: ~150GB (stored fields)

Number of masters: 30+
Performance: SUCKS big time until the caches catch up. Unfortunately that
takes quite some time.

Issues:
#1: Storage: To use SAN or not.
#2: Cores per instance: what is ideal?
#3: Size of cores: is 14 days optimal?
#4: Performance when searching across shards.
#5: Would SolrCloud be the solution for us?





Med venlig hilsen / Best Regards

Christian von Wendt-Jensen
IT Team Lead, Customer Solutions

Infopaq International A/S
Kgs. Nytorv 22
DK-1050 København K

Phone +45 36 99 00 00
Mobile +45 31 17 10 07
Email  christian.sonne.jen...@infopaq.com
Web    www.infopaq.com









From: Annette Newton annette.new...@servicetick.com
Reply-To: solr-user@lucene.apache.org
Date: Wed, 13 Mar 2013 15:49:34 +0100
To: solr-user@lucene.apache.org
Subject: Re: Poll: Largest SolrCloud out there?

8 AWS hosts.
35GB memory per host
10Gb allocated to JVM
13 aws compute units per instance
4 Shards, 2 replicas
25M docs in total
22.4GB index per shard
High writes, low reads




On 13 March 2013 09:12, adm1n evgeni.evg...@gmail.com wrote:

4 AWS hosts:
Memory: 30822868k total
CPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz x8
17M docs
5 Gb index.
8 master-slave shards (2 shards /host).
57 msec/query avg. time. (~110K queries/24 hours).









--

Annette Newton

Database Administrator

ServiceTick Ltd



T:+44(0)1603 618326



Seebohm House, 2-4 Queen Street, Norwich, England NR2 4SQ

www.servicetick.com

*www.sessioncam.com*




Advice: solrCloud + DIH

2013-03-14 Thread roySolr
Hello,

I need some advice with my SolrCloud cluster and the DIH. I have a cluster
with 3 cloud servers. Every server has a Solr instance and a ZooKeeper
instance. I start it with the -DzkHost parameter. It works great; I send
updates with a curl (XML) call like this:

curl http://ip:port/solr/update -H "Content-Type: text/xml" --data-binary
'<add><doc><field name="id">223232</field><field name="content">test</field></doc></add>'

Solr has 2 million docs in the index. Now I want an extra field: content2. I
add this to my schema and upload it again to the cluster with
-Dbootstrap_confdir and -Dcollection.configName. It's replicated to the
whole cluster.

Now I need to re-index to add the field to every doc. I have a database with
all the data and want to use the full-import of DIH (this was the way I did
this in previous Solr versions). When I run this it goes at 3 docs/sec (really
slow). When I run Solr alone (not SolrCloud) it goes at 600 docs/sec.

What's the best way to do a full re-index with SolrCloud? Does SolrCloud
support DIH?

Thanks





Re: Poll: Largest SolrCloud out there?

2013-03-14 Thread Otis Gospodnetic
Christian,

SSDs will warm up muuuch faster.
Your other questions require more info / discussion.

Otis
Solr  ElasticSearch Support
http://sematext.com/
On Mar 14, 2013 8:47 AM, Christian von Wendt-Jensen 
christian.vonwendt-jen...@infopaq.com wrote:

 [Christian's message and the rest of the quoted thread snipped; see the
 original messages above]



Re: OutOfMemoryError

2013-03-14 Thread Toke Eskildsen
On Thu, 2013-03-14 at 13:10 +0100, Arkadi Colson wrote:
 When I shut down Tomcat, free -m and top keep showing the same values.
 Almost no free memory...
 
 Any idea?

Are you reading top & free right? It is standard behaviour for most
modern operating systems to have very little free memory. As long as the
sum of free memory and cache is high, everything is fine.

Looking at the stats you gave previously we have

  top:
    PID USER  PR  NI  VIRT   RES   SHR  S %CPU %MEM   TIME+     COMMAND
  13666 root  20   0  86.8g  4.7g  248m S  101 39.7  478:37.45

4.7GB physical memory used and ~80GB used for memory mapping the index.

  free -m:
               total   used   free  shared  buffers  cached
  Mem:         12047  11942    105       0      180    6363
  -/+ buffers/cache:   5399   6648
  Swap:          956     75    881

So 6648MB used for either general disk cache or memory mapped index.
This really translates to 6648MB (plus the 105MB above) available memory
as any application asking for memory will get it immediately from that
pool (sorry if this is basic stuff for you).

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
  Caused by: java.lang.OutOfMemoryError
        at java.util.zip.ZipFile.open(Native Method)
        at java.util.zip.ZipFile.<init>(ZipFile.java:127)
        at java.util.zip.ZipFile.<init>(ZipFile.java:144)
        at org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipFile(ZipHelper.java:157)
[...]

  Java HotSpot(TM) 64-Bit Server VM warning: Attempt to allocate stack 
  guard pages failed.
  mmap failed for CEN and END part of zip file

A quick search shows that other people have had problems with ZipFile in
at least some sub-versions of Java 1.7. However, another very common
cause for OOM with memory mapping is that the limit for allocating
virtual memory is too low.

Try doing a
 ulimit -v
on the machine. If the number is somewhere around 100000000 KB (100GB),
Lucene's memory mapping of your index (the 80GB) plus the ZipFile's
memory mapping plus other processes might hit the ceiling. If that is
the case, simply raise the limit.
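
For example (a sketch, assuming bash; the value is in KB, and a permanent
setting would go in /etc/security/limits.conf instead):

# check the current virtual memory limit
ulimit -v
# raise it for this shell and its children, e.g. to ~200GB,
# then start Tomcat from the same shell
ulimit -v 209715200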

- Toke



Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Mark Miller

On Mar 14, 2013, at 8:10 AM, Rafał Radecki radecki.ra...@gmail.com wrote:

 Is this a bug?

Yes, 4.1 had some replication issues just as you seem to describe here. It all 
should be fixed in 4.2 which is available now and is a simple upgrade.

- Mark

Re: Advice: solrCloud + DIH

2013-03-14 Thread Mark Miller

On Mar 14, 2013, at 9:22 AM, roySolr royrutten1...@gmail.com wrote:

 Hello,
 
  When i run this it goes with 3 doc/s(Really
 slow). When i run solr alone(not solrcloud) it goes 600 docs/sec. 
 
 What's the best way to do a full re-index with solrcloud? Does solrcloud
 support DIH?
 
 Thanks
 

SolrCloud supports DIH, but not fully and happily. It's set up to work pretty
nicely with non-SolrCloud - it will load pretty quickly - with SolrCloud a few
things can happen - one is that you might be running DIH on a replica rather
than a leader - and that can change without your consent - in this case all
docs will go to another node and then come back. SolrCloud also works best with
multiple threads really - DIH will only use one to my knowledge.
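
If you do stick with DIH, one way to avoid that replica detour is to kick the
import off on the leader directly (a sketch; it assumes DIH is registered at
/dataimport and that your collection is called collection1):

curl 'http://leader-host:8983/solr/collection1/dataimport?command=full-import'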

Still, at 3 docs/s, something sounds wrong. That's too slow.

- Mark



Re: OutOfMemoryError

2013-03-14 Thread Arkadi Colson


On 03/14/2013 03:11 PM, Toke Eskildsen wrote:

On Thu, 2013-03-14 at 13:10 +0100, Arkadi Colson wrote:

When I shut down Tomcat, free -m and top keep showing the same values.
Almost no free memory...

Any idea?

Are you reading top  free right? It is standard behaviour for most
modern operating systems to have very little free memory. As long as the
sum of free memory and cache is high, everything is fine.

Looking at the stats you gave previously we have


 [top/free output and stack-trace excerpt quoted above, snipped]

A quick search shows that other people have had problems with ZipFile in
at least some sub-versions of Java 1.7. However, another very common
cause for OOM with memory mapping is that the limit for allocating
virtual memory is too low.

We do not index zip files so that could not cause the problem


Try doing a
  ulimit -v
on the machine. If the number is somewhere around 100000000 KB (100GB),
Lucene's memory mapping of your index (the 80GB) plus the ZipFile's
memory mapping plus other processes might hit the ceiling. If that is
the case, simply raise the limit.

- Toke


ulimit -v shows me unlimited


I decreased the hard commit time to 10 seconds and set ramBufferSizeMB 
to 250. Hope this helps...
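
In solrconfig.xml terms that is roughly the following (a sketch;
openSearcher=false keeps the hard commit from opening a new searcher, which
the soft commits already handle, and ramBufferSizeMB sits under indexConfig):

<autoCommit>
  <maxTime>10000</maxTime>           <!-- hard commit every 10 seconds -->
  <openSearcher>false</openSearcher> <!-- visibility comes from soft commits -->
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>            <!-- soft commit every second -->
</autoSoftCommit>
<ramBufferSizeMB>250</ramBufferSizeMB>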

Will keep you informed!

Thanks for the explanation!


Replication

2013-03-14 Thread Arkadi Colson
Based on what does Solr decide to replicate the whole shard again from
scratch? From time to time, after a restart of Tomcat, Solr copies the whole
shard over to the replica instead of transferring only the changes.


BR,
Arkadi


Question about email search

2013-03-14 Thread Jorge Luis Betancourt Gonzalez
I'm using Solr 3.6.2 to index some data crawled with Nutch. In my schema I have
one field with all the content extracted from the page, which could possibly
include email addresses. This is the configuration of my schema:

<fieldType name="text" class="solr.TextField"
           positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" languange="Spanish"/>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

The thing is that I'm trying to search against a field of this type (text)
with a value like @gmail.com, and I intend to get the documents containing
that text. Any advice?

slds
--
It is only in the mysterious equation of love that any 
logical reasons can be found.
Good programmers often confuse halloween (31 OCT) with 
christmas (25 DEC)



Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread richardg
I believe this is the same issue as described: I'm running 4.2 and as you can
see my slave is a couple of versions ahead of the master (all three slaves
show the same behavior). This was never the case until I upgraded from 4.0 to
4.2.

          Index Version   Gen   Size
Master:   1363272681951   93    1,022.31 MB
Slave:    1363273274085   95    1,022.31 MB





Re: Question about email search

2013-03-14 Thread Ahmet Arslan
Hi,

Since you have the word delimiter filter in your analysis chain, I am not sure
whether e-mail addresses are recognised. You can check that on the Solr admin
UI, analysis page.

If e-mail addresses are kept as one token, I would use a leading wildcard
query: q=*@gmail.com

There was a similar question recently: 
http://search-lucene.com/m/XF2ejnM6Vi2
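
For example (a sketch; "content" is an assumed field name, and leading
wildcards can be slow on a big index - ReversedWildcardFilterFactory is the
usual way to speed them up):

curl 'http://localhost:8983/solr/select' -G \
  --data-urlencode 'q=content:*@gmail.com'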

--- On Thu, 3/14/13, Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu wrote:

 [original message quoted in full, snipped]



Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
Hi

We have been using Solr 4.0 for a while now and wanted to upgrade to 4.2.
But our application stopped working. When we tried 4.1 it was working as
expected.

Here is a description of the situation.

We deploy a Solr web application under Java 7 on a Glassfish 3.1.2.2
server. We added some classes to the standard Solr webapp which listen
to a JMS service and update the index according to the message content,
which can be "fetch the document with this id from that URL and add it
to the index". The documents are fetched via SSL from a repository server.

This has been working well since Solr 1.2, for about 6 years now. With Solr
4.2 we suddenly get the following error:

javax.ejb.CreateException: Initialization failed for Singleton IndexMessageClientFactory
        at com.sun.ejb.containers.AbstractSingletonContainer.createSingletonEJB(AbstractSingletonContainer.java:547)
...
Caused by: org.apache.http.conn.ssl.SSLInitializationException: Failure initializing default system SSL context
        at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:368)
        at org.apache.http.conn.ssl.SSLSocketFactory.getSystemSocketFactory(SSLSocketFactory.java:204)
        at org.apache.http.impl.conn.SchemeRegistryFactory.createSystemDefault(SchemeRegistryFactory.java:82)
        at org.apache.http.impl.client.SystemDefaultHttpClient.createClientConnectionManager(SystemDefaultHttpClient.java:118)
        at org.apache.http.impl.client.AbstractHttpClient.getConnectionManager(AbstractHttpClient.java:466)
        at org.apache.solr.client.solrj.impl.HttpClientUtil.setMaxConnections(HttpClientUtil.java:179)
        at org.apache.solr.client.solrj.impl.HttpClientConfigurer.configure(HttpClientConfigurer.java:33)
        at org.apache.solr.client.solrj.impl.HttpClientUtil.configureClient(HttpClientUtil.java:115)
        at org.apache.solr.client.solrj.impl.HttpClientUtil.createClient(HttpClientUtil.java:105)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.<init>(HttpSolrServer.java:155)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.<init>(HttpSolrServer.java:132)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.<init>(ConcurrentUpdateSolrServer.java:101)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.<init>(ConcurrentUpdateSolrServer.java:93)
        at diva.commons.search.cdi.SolrServerFactory.init(SolrServerFactory.java:56)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at com.sun.ejb.containers.interceptors.BeanCallbackInterceptor.intercept(InterceptorManager.java:1009)
        at com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:65)
        at com.sun.ejb.containers.interceptors.CallbackInvocationContext.proceed(CallbackInvocationContext.java:113)
        at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.doCallback(SystemInterceptorProxy.java:138)
        at com.sun.ejb.containers.interceptors.SystemInterceptorProxy.init(SystemInterceptorProxy.java:120)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at com.sun.ejb.containers.interceptors.CallbackInterceptor.intercept(InterceptorManager.java:964)
        at com.sun.ejb.containers.interceptors.CallbackChainImpl.invokeNext(CallbackChainImpl.java:65)
        at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:393)
        at com.sun.ejb.containers.interceptors.InterceptorManager.intercept(InterceptorManager.java:376)
        at com.sun.ejb.containers.AbstractSingletonContainer.createSingletonEJB(AbstractSingletonContainer.java:538)
        ... 103 more
Caused by: java.io.IOException: Keystore was tampered with, or password was incorrect
        at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:772)
        at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:55)
        at java.security.KeyStore.load(KeyStore.java:1214)
        at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:281)
        at org.apache.http.conn.ssl.SSLSocketFactory.createSystemSSLContext(SSLSocketFactory.java:366)
        ... 134 more
Caused by: java.security.UnrecoverableKeyException: Password verification failed
        at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:770)


This exception occurs in this part

new 

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Mark Miller
What calls are you using to get the versions? Or is it the admin UI?

Also can you add any details about your setup - if this is a problem, we need 
to duplicate it in one of our unit tests.

Also, is it affecting proper replication in any way that you can tell.

- Mark

On Mar 14, 2013, at 11:12 AM, richardg richa...@dvdempire.com wrote:

 [richardg's message quoted in full, snipped]



Re: Strange error in Solr 4.2

2013-03-14 Thread Mark Miller
Perhaps as a result of https://issues.apache.org/jira/browse/SOLR-4451 ?

Just a guess.

The root cause looks to be:

 Caused by: java.io.IOException: Keystore was tampered with, or password was
 incorrect
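
That system-default SSL context is built from the standard JSSE system
properties, so one guess is that the container sets a keystore path whose
password no longer matches. A sketch of the properties involved (the paths
and passwords here are placeholders, not your actual values):

-Djavax.net.ssl.keyStore=/path/to/keystore.jks
-Djavax.net.ssl.keyStorePassword=changeit
-Djavax.net.ssl.trustStore=/path/to/cacerts.jks
-Djavax.net.ssl.trustStorePassword=changeit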


- Mark

On Mar 14, 2013, at 11:24 AM, Uwe Klosa uwe.kl...@gmail.com wrote:

 [original message quoted in full, snipped]

need general advice on how others version and mange core deployments over time

2013-03-14 Thread geeky2
hello everyone,

i know this is a general topic - but would really appreciate info from
others that are doing this now.

  - how are others managing this so that users are impacted the least?
  - how are others handling the scenario where users don't want to migrate
forward?

thx
mark






--
View this message in context: 
http://lucene.472066.n3.nabble.com/need-general-advice-on-how-others-version-and-mange-core-deployments-over-time-tp4047390.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
Thanks, but nobody has tampered with keystores. I have tested the
application on different machines. Always the same exception is thrown.

Do we have to set some system property to fix this?

/Uwe




On 14 March 2013 16:36, Mark Miller markrmil...@gmail.com wrote:

 Perhaps as a result of https://issues.apache.org/jira/browse/SOLR-4451 ?

 Just a guess.

 The root cause looks to be:

  Caused by: java.io.IOException: Keystore was tampered with, or password
 was
  incorrect


 - Mark

 On Mar 14, 2013, at 11:24 AM, Uwe Klosa uwe.kl...@gmail.com wrote:

  Hi
 
  We have been using Solr 4.0 for a while now and wanted to upgrade to 4.2.
  But our application stopped working. When we tried 4.1 it was working as
  expected. [rest of quoted message and stack trace snipped]

Handling a closed IndexWriter in SOLR 4.0

2013-03-14 Thread Danzig, Scott
Hey all,

We're using a Solr 4 core to handle our article data.  When someone in our CMS 
publishes an article, we have a listener that indexes it straight to solr.  We 
use the previously instantiated HttpSolrServer, build the solr document, add it 
with server.add(doc) .. then do a server.commit() right away.  For some reason, 
sometimes this exception is thrown, which I suspect is related to a 
simultaneous data import done from another client which sometimes errors:

Feb 26, 2013 5:07:51 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1310)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1422)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1200)
at 
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:560)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:87)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1007)
at 
org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
at 
org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:225)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:999)
at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:565)
at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:309)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is 
closed
at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:550)
at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:563)
at org.apache.lucene.index.IndexWriter.nrtIsCurrent(IndexWriter.java:4196)
at 
org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:266)
at 
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:245)
at 
org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:235)
at 
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:169)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1256)
... 28 more

I'm not sure if the error is causing the IndexWriter to close, or why an 
IndexWriter would be shared across clients, but usually I can get around this 
by basically creating a new HttpSolrServer and trying again.  But it doesn't 
always work, perhaps due to frequency… I don't like the idea of an infinite 
loop of creating connections until it works.  I'd rather understand what's 
going on.  What's the proper way to fix this?  I see I can add a doc with a 
commitWithinMs of 0, and maybe that couples the add tightly with the commit and 
would prevent interference.  But am I totally off the mark here as to the 
problem?  Suggestions?

Posted this on java-user before, but then realized solr-user existed, so please 
forgive the redundancy…

Thanks for reading!

- Scott

Re: Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
I found the answer myself. Thanks for the pointer.

Cheers
Uwe


On 14 March 2013 16:48, Uwe Klosa uwe.kl...@gmail.com wrote:

 Thanks, but nobody has tempered with keystores. I have tested the
 application on different machines. Always the same exception is thrown.

 Do we have to set some system property to fix this?

 /Uwe




 On 14 March 2013 16:36, Mark Miller markrmil...@gmail.com wrote:

 Perhaps as a result of https://issues.apache.org/jira/browse/SOLR-4451 ?

 Just a guess.

 The root cause looks to be:

  Caused by: java.io.IOException: Keystore was tampered with, or password
 was
  incorrect


 - Mark

 On Mar 14, 2013, at 11:24 AM, Uwe Klosa uwe.kl...@gmail.com wrote:

   Hi
  
   We have been using Solr 4.0 for a while now and wanted to upgrade to 4.2.
   [rest of quoted message and stack trace snipped]

Re: Replication

2013-03-14 Thread Timothy Potter
Hi Arkadi,

If the update delta between the shard leader and replica exceeds 100 docs, then
Solr punts and replicates the entire index. Last I heard, the 100 was
hard-coded in 4.0, so it is not configurable. This makes sense because the
replica shouldn't be out of sync with the leader unless it has been offline.

Cheers,
Tim

On Thu, Mar 14, 2013 at 9:05 AM, Arkadi Colson ark...@smartbit.be wrote:

 Based on what does Solr replicate the whole shard again from scratch? From
 time to time after a restart of Tomcat, Solr copies over the whole shard to
 the replica instead of copying only the changes.

 BR,
 Arkadi



Out of Memory doing a query Solr 4.2

2013-03-14 Thread raulgrande83
Hi 

After doing a query to Solr to get the uniqueIds (strings of 20 characters)
of 700 documents in a collection, I'm getting an out-of-memory error using
Solr 4.2. I tried increasing the JVM memory by 1G (from 3G to 4G); however,
this didn't change anything.

This was working on 3.5. 

I've moved from 3.5 to 4.2.

Did anyone have the same problem?

Thanks


--

Details :

Solr 4.2
Solr Index 20G aprox.

JVM: IBM J9 VM(1.6.0.2.4)
JVM-Memory:4G
OS: Linux
Processors 8
RAM: 101G



org.apache.solr.common.SolrException log 
SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError 
at
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:651)
 
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:364)
 
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
 
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
 
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
 
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:240)
 
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:164)
 
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164) 
at
org.apache.catalina.ha.session.JvmRouteBinderValve.invoke(JvmRouteBinderValve.java:218)
 
at
org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:333) 
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100) 
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
 
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:394) 
at
org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:284)
 
at
org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:322)
 
at
org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1714)
 
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:898)
 
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:920) 
at java.lang.Thread.run(Thread.java:736) 
Caused by: java.lang.OutOfMemoryError 
at java.util.Arrays.copyOfRange(Arrays.java:4114) 
at java.util.Arrays.copyOf(Arrays.java:3833) 
at java.lang.StringCoding.safeTrim(StringCoding.java:686) 
at java.lang.StringCoding.access$300(StringCoding.java:41) 
at
java.lang.StringCoding$StringDecoder.decode(StringCoding.java:739) 
at java.lang.StringCoding.decode(StringCoding.java:746) 
at java.lang.String.init(String.java:2036) 
at java.lang.String.init(String.java:2011) 
at
org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.readField(CompressingStoredFieldsReader.java:143)
 
at
org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:272)
 
at
org.apache.lucene.index.SegmentReader.document(SegmentReader.java:139) 
at
org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:116)
 
at
org.apache.lucene.index.IndexReader.document(IndexReader.java:436) 
at
org.apache.lucene.document.LazyDocument.getDocument(LazyDocument.java:65) 
at
org.apache.lucene.document.LazyDocument.access$000(LazyDocument.java:36) 
at
org.apache.lucene.document.LazyDocument$LazyField.stringValue(LazyDocument.java:105)
 
at org.apache.solr.schema.FieldType.toExternal(FieldType.java:346) 
at org.apache.solr.schema.FieldType.toObject(FieldType.java:355) 
at
org.apache.solr.response.BinaryResponseWriter$Resolver.getValue(BinaryResponseWriter.java:208)
 
at
org.apache.solr.response.BinaryResponseWriter$Resolver.getDoc(BinaryResponseWriter.java:186)
 
at
org.apache.solr.response.BinaryResponseWriter$Resolver.writeResultsBody(BinaryResponseWriter.java:147)
 
at
org.apache.solr.response.BinaryResponseWriter$Resolver.writeResults(BinaryResponseWriter.java:173)
 
at
org.apache.solr.response.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:86)
 
at
org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:154) 
at
org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:144) 
at
org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:234) 
at
org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:149) 
at
org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:92) 
at

ids request to shard with star query are slow

2013-03-14 Thread srinir

I have a distributed Solr environment and I am investigating all the
requests where a shard took a significant amount of time. One common pattern
I saw was that all the ids requests with q=*:* and ids="some id" took around
2-3 sec. I picked some shard requests with q=xyz and ids="some id" and all
of them took only a few milliseconds.

I copied the params and manually sent the same request to that particular
shard, and again it took around 2.5 sec. But when I removed the query (q=*:*)
parameter and sent the same set of params to the same shard, I got the
response back in 10 or 20 milliseconds. In both cases the response contained
the document I was looking for.

took 2-3 sec
-
q=*:*
qt=search
ids=123
isShard=true

took 20ms
-
qt=search
ids=123
isShard=true

In my understanding the ids param is used to fetch the stored fields in a
distributed search. Why does the query parameter (q=) matter here?

Thanks
Srini



--
View this message in context: 
http://lucene.472066.n3.nabble.com/ids-request-to-shard-with-star-query-are-slow-tp4047395.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Strange error in Solr 4.2

2013-03-14 Thread Stefan Matheis


On Thursday, March 14, 2013 at 4:57 PM, Uwe Klosa wrote:

 I found the answer myself. Thanks for the pointer.


Would you mind sharing your answer, Uwe? 
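
Uwe's fix isn't recorded in the thread. Judging from the stack trace, though,
SystemDefaultHttpClient builds its SSL context from the standard JSSE system
properties, so a keystore path or password set incorrectly there would produce
exactly this "Keystore was tampered with" error. A hypothetical repair,
assuming the properties are merely stale or wrong (path and password are
placeholders):

-Djavax.net.ssl.keyStore=/path/to/keystore.jks
-Djavax.net.ssl.keyStorePassword=actual-password

Alternatively, drop the javax.net.ssl.* properties entirely if no client-side
keystore is needed.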


Re: Out of Memory doing a query Solr 4.2

2013-03-14 Thread Robert Muir
On Thu, Mar 14, 2013 at 12:07 PM, raulgrande83 raulgrand...@hotmail.com wrote:
 JVM: IBM J9 VM(1.6.0.2.4)

I don't recommend using this JVM.


Re: Strange error in Solr 4.2

2013-03-14 Thread Shawn Heisey

On 3/14/2013 9:24 AM, Uwe Klosa wrote:

This exception occurs in this part

new ConcurrentUpdateSolrServer("http://solr.diva-portal.org:8080/search", 5, 50)


Side comment, unrelated to your question:

If you're already aware that ConcurrentUpdateSolrServer has no built-in 
error handling and you're OK with that, then you don't need to be 
concerned with this message.


ConcurrentUpdateSolrServer swallows any exception that happens during 
its operation.  Errors get logged, but are not passed back to the 
calling application.  Update requests always succeed, even if Solr is 
completely down.


I have been told that it is possible to override the handleError method 
to fix this, but I don't know what code to actually use.


Thanks,
Shawn
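
A minimal sketch of the kind of override Shawn mentions (illustrative only:
the class name and the AtomicReference bookkeeping are assumptions, not a
tested recipe):

import java.util.concurrent.atomic.AtomicReference;

import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;

public class ErrorTrackingSolrServer extends ConcurrentUpdateSolrServer {

    // remembers the most recent failure so callers can check it after a batch
    private final AtomicReference<Throwable> lastError =
            new AtomicReference<Throwable>();

    public ErrorTrackingSolrServer(String solrServerUrl, int queueSize,
            int threadCount) {
        super(solrServerUrl, queueSize, threadCount);
    }

    @Override
    public void handleError(Throwable ex) {
        super.handleError(ex); // keep the default logging behaviour
        lastError.set(ex);     // and surface the failure to the caller
    }

    /** Returns and clears the last error; null means the batch went through. */
    public Throwable pollError() {
        return lastError.getAndSet(null);
    }
}

The caller would add its documents, commit, then check pollError() and
re-send the batch (e.g. through a plain HttpSolrServer) if anything failed.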



Re: Strange error in Solr 4.2

2013-03-14 Thread Mark Miller

On Mar 14, 2013, at 1:27 PM, Shawn Heisey s...@elyograg.org wrote:

 I have been told that it is possible to override the handleError method to 
 fix this

I'd say mitigate more than fix. I think the real fix requires some dev work. 

- Mark

Re: OutOfMemoryError

2013-03-14 Thread Shawn Heisey

On 3/14/2013 3:35 AM, Arkadi Colson wrote:

Hi

I'm getting this error after a few hours of filling solr with documents.
Tomcat is running with -Xms1024m -Xmx4096m.
Total memory of host is 12GB. Softcommits are done every second and hard
commits every minute.
Any idea why this is happening and how to avoid this?


*top*
   PID USER  PR  NI  VIRT  RES  SHR  S %CPU %MEM     TIME+  COMMAND
 13666 root  20   0 86.8g 4.7g 248m  S  101 39.7 478:37.45  /usr/bin/java
   -Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties -server
   -Xms1024m -Xmx4096m -XX:PermSize=64m -XX:MaxPermSize=128m
   -Duser.timezone=UTC -Dfile.encoding=UTF8 -Dsolr.solr.home=/opt/solr/
   -Dport=8983 -Dcollection.configName
 22247 root  20   0 2430m 409m 4176  S    0  3.4   1:23.43  java
   -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp
   /opt/zookeeper/bin/../build/classes:/opt/zookeeper/bin/../build/lib/*.jar:/opt/zookeeper/bi


*free -m*
              total   used   free  shared  buffers  cached
Mem:          12047  11942    105       0      180    6363
-/+ buffers/cache:    5399   6648
Swap:           956     75    881


As you've already been told, this looks like you have about 80GB of 
index.  I ran into Out Of Memory problems with heavy indexing with a 4GB 
heap on a total index size just a little bit smaller than this.  I had 
to increase the heap size to 8GB.


With heap sizes this large, you'll see garbage collection pause problems 
without careful tuning.  You're probably already having these problems 
with the 4GB heap, but they'll get much worse with an 8GB heap.  Here 
are the memory options I'm using that got rid of my GC pause problem. 
I'm using these with with the Sun/Oracle JVM, on both 1.6 and 1.7:


-Xmx8192M -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 
-XX:NewRatio=3 -XX:MaxTenuringThreshold=8 -XX:+CMSParallelRemarkEnabled 
-XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts


I notice that you've got options that change the PermSize and 
MaxPermSize.  You probably don't need these options, unless you know 
that you'll run into problems without them.


Additional note: if you have greatly increased RamBufferSizeMB, try 
reducing it to 100, the default on recent versions.  The default used to 
be 32.  Either amount is usually plenty, unless you have huge documents.
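
For reference, that setting lives under indexConfig in solrconfig.xml; the
value shown is the recent default mentioned above:

<indexConfig>
  <ramBufferSizeMB>100</ramBufferSizeMB>
</indexConfig>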


Side comment: 12GB total RAM isn't going to be enough memory for top 
performance with 80GB of index.  You'll probably need 8GB of java heap, 
plus between 40 and 80GB of memory for the OS disk cache, to fit a large 
chunk (or all) of your index into RAM.  48GB would be a good start, 64 
to 128GB would be better.


Thanks,
Shawn



Meaning of Current in Solr Cloud Statistics

2013-03-14 Thread Michael Della Bitta
Hi everyone,

Is there an official definition of the "Current" flag under Core 
Home > Statistics?

What would it mean if a shard leader is not Current?

Thanks,

Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


Solr 4.2 mechanism proxy request error

2013-03-14 Thread yriveiro
Hi, 

I think that in Solr 4.2 the new feature to proxy a request when the
collection is not on the requested node has a bug.

If I do a query with the parameter rows=0 and the node doesn't have the
collection, the request fails. If the parameter is rows=4 or higher, the
search works as expected.

curl, wget and Chrome all fail in the same way.

The output of wget is:

Connecting to 192.168.20.48:8983... connected.
HTTP request sent, awaiting response... 200 OK
Length: 210 [application/xml]
Saving to: ‘select?q=*:*rows=0’

 0% [                                                  ] 0   --.-K/s   in 0s

2013-03-14 18:01:04 (0.00 B/s) - Connection closed at byte 0. Retrying.

Curl says:

curl "http://192.168.20.48:8983/solr/ST-3A856BBCA3_12/select?q=*%3A*&rows=0"
curl: (56) Problem (2) in the Chunked-Encoded data

Chrome says:

This webpage is not available
The webpage at
http://192.168.20.48:8983/solr/ST-3A856BBCA3_12/select?q=*%3A*&rows=0&wt=xml&indent=true
might be temporarily down or it may have moved permanently to a new web
address.
Error 321 (net::ERR_INVALID_CHUNKED_ENCODING): Unknown error.

Someone have the same issue?





-
Best regards
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-2-mechanism-proxy-request-error-tp4047433.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr 4.2 mechanism proxy request error

2013-03-14 Thread Mark Miller
I'll add a test with rows = 0 and see how easy it is to replicate.

Looks to me like you should file a JIRA issue in any case.

- Mark

On Mar 14, 2013, at 2:04 PM, yriveiro yago.rive...@gmail.com wrote:

 Hi, 
 
  [rest of quoted message snipped]



Re: Solr 4.2 mechanism proxy request error

2013-03-14 Thread yriveiro
The log of the UI 

null:org.apache.solr.common.SolrException: Error trying to proxy request for
url: http://192.168.20.47:8983/solr/ST-3A856BBCA3_12/select

I will open the issue in Jira.

Thanks



-
Best regards
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-2-mechanism-proxy-request-error-tp4047433p4047440.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Version conflict during data import from another Solr instance into clean Solr

2013-03-14 Thread Chris Hostetter

: It looks strange to me that if there is no document yet (foundVersion < 0)
: then the only case when document will be imported is when input version is
: negative. Guess I need to test specific cases using SolrJ or smth. to be sure.

you're assuming that if foundVersion < 0 that means no document *yet* ... 
it could also mean there was a document, and it's been deleted.

Either way, if the client has said "(replace|update) version X of doc D", 
the code is failing because it can't: doc D does not exist with version 
X.  Regardless of whether someone deleted doc D, or replaced it with a 
newer version, or it never existed in the first place, Solr can't do what 
you asked it to do.

: Anyway I'll also check if I can inherit from SolrEntityProcessor and override
: _version_ field there before insertion.

Easier solutions to consider (off the cuff, not tested)...

1) in your SolrEntityProcessor, configure fl with something like this 
to alias the _version_ field to something else

   fl=*,old_version:_version_

2) configure your destination Solr instance with an update chain that 
ignores the _version_ field (you wouldn't want this for most normal usage, 
but it would be suitable for these kinds of from-scratch imports from 
other Solr instances)...

https://lucene.apache.org/solr/4_2_0/solr-core/org/apache/solr/update/processor/IgnoreFieldUpdateProcessorFactory.html



-Hoss
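
A minimal sketch of the chain Hoss describes in (2), for the destination
solrconfig.xml (the chain name here is arbitrary; you would point the
import's update handler at it via the update.chain request parameter):

<updateRequestProcessorChain name="ignore-version">
  <processor class="solr.IgnoreFieldUpdateProcessorFactory">
    <str name="fieldName">_version_</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>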


Re: Question about email search

2013-03-14 Thread Jorge Luis Betancourt Gonzalez
Sorry for the duplicated mail :-(. Any advice on a configuration for searching 
email addresses in a field that does not contain only email addresses, i.e. 
where the addresses are embedded in larger text?

- Mensaje original -
De: Ahmet Arslan iori...@yahoo.com
Para: solr-user@lucene.apache.org
Enviados: Jueves, 14 de Marzo 2013 11:23:47
Asunto: Re: Question about email search

Hi,

Since you have a word delimiter filter in your analysis chain, I am not sure if 
e-mail addresses are recognised. You can check that in the Solr admin UI, on the 
analysis page.

If e-mail addresses are kept as one token, I would use a leading wildcard query:
q=*@gmail.com

There was a similar question recently:
http://search-lucene.com/m/XF2ejnM6Vi2

--- On Thu, 3/14/13, Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu wrote:

 From: Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu
 Subject: Question about email search
 To: solr-user@lucene.apache.org
 Date: Thursday, March 14, 2013, 5:11 PM
 I'm using solr 3.6.2 to crawl some
 data using nutch, in my schema I've one field with all the
 content extracted from the page, which could possibly
 include email addresses, this is the configuration of my
 schema:

         <fieldType name="text" class="solr.TextField"
             positionIncrementGap="100" autoGeneratePhraseQueries="true">
           <analyzer type="index">
             <tokenizer class="solr.StandardTokenizerFactory"/>
             <filter class="solr.StandardFilterFactory"/>
             <filter class="solr.ISOLatin1AccentFilterFactory"/>
             <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
             <charFilter class="solr.HTMLStripCharFilterFactory"/>
             <filter class="solr.StopFilterFactory"
                 ignoreCase="true" words="stopwords.txt"/>
             <filter class="solr.WordDelimiterFilterFactory"
                 generateWordParts="1" generateNumberParts="1"
                 catenateWords="1" catenateNumbers="1" catenateAll="0"
                 splitOnCaseChange="1"/>
             <filter class="solr.LowerCaseFilterFactory"/>
             <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
           </analyzer>
         </fieldType>

 The thing is that I'm trying to search against a field of
 this type (text) with a value like @gmail.com, and I'm
 expecting to get documents containing that text. Any advice?

 slds
 --
 It is only in the mysterious equation of love that any
 logical reasons can be found.
 Good programmers often confuse halloween (31 OCT) with
 christmas (25 DEC)




Searching across multiple collections (cores)

2013-03-14 Thread kfdroid
I've been looking all over for a clear answer to this question and can't seem
to find one. It seems like a very basic concept to me though so maybe I'm
using the wrong terminology.  I want to be able to search across multiple
collections (as it is now called in SolrCloud world, previously called
Cores).  I want the scoring, sorting, faceting etc. to be blended, that is
to be relevant to data from all the collections, not just a set of
independent results per collection.  Is that possible?

A real-world example would be a merchandise site that has books, movies and
music. The index for each of those is quite different and they would have
their own schema.xml (and therefore be their own Collection). When in the
'books' area of a website the users could search on fields specific to books
(ISBN for example). However on a 'home' page a search would span across all
3 product lines, and the results should be scored relative to each other,
not just relative to other items in their specific collection. 

Is this possible in v4.0? I'm pretty sure it wasn't in v1.4.1. But it seems
to be a fundamentally useful concept, I was wondering if it had been
addressed yet.
Thanks,
Ken



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Searching-across-multiple-collections-cores-tp4047457.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Searching across multiple collections (cores)

2013-03-14 Thread Mark Miller
Yes, with SolrCloud, it's just the collection param (as long as the schemas are 
compatible for this):

http://wiki.apache.org/solr/SolrCloud#Distributed_Requests

- Mark
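
For example (host and collection names illustrative), one request can span
several collections and come back as a single blended, scored result set:

curl "http://localhost:8983/solr/books/select?q=harry&collection=books,movies,music"

The caveat above still applies: blended scoring and sorting assume the
schemas are compatible for the fields being queried.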

On Mar 14, 2013, at 2:55 PM, kfdroid kfdr...@gmail.com wrote:

 I've been looking all over for a clear answer to this question and can't seem
 to find one. It seems like a very basic concept to me though so maybe I'm
 using the wrong terminology.  I want to be able to search across multiple
 collections (as it is now called in SolrCloud world, previously called
 Cores).  I want the scoring, sorting, faceting etc. to be blended, that is
 to be relevant to data from all the collections, not just a set of
 independent results per collection.  Is that possible?
 
  A real-world example would be a merchandise site that has books, movies and
  music. [rest of quoted message snipped]



Re: Meaning of Current in Solr Cloud Statistics

2013-03-14 Thread Stefan Matheis
Hey Michael

I was a bit confused because you mentioned SolrCloud in the subject. We're 
talking about http://host:port/solr/#/collection1 (f.e.), right? And there, 
the upper-left "Statistics" box?

If so, the Output comes from /solr/collection1/admin/luke ( 
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/admin/LukeRequestHandler.java?view=markup#l551
 ) which uses DirectoryReader.isCurrent() under the Hood.

That method contains an explanation in its javadocs: 
http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/index/DirectoryReader.html#isCurrent()

HTH
Stefan
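
In plain Lucene terms, the flag reflects whether anything has been committed
to the index since the searcher's reader was opened. A rough sketch (the
index path is illustrative):

import java.io.File;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class IsCurrentCheck {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/path/to/index"));
        DirectoryReader reader = DirectoryReader.open(dir);
        // ... meanwhile some IndexWriter commits new documents to the same directory ...
        System.out.println(reader.isCurrent()); // false once a newer commit exists
        reader.close();
    }
}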



On Thursday, March 14, 2013 at 7:01 PM, Michael Della Bitta wrote:

 Hi everyone,
  
 Is there an official definition of the "Current" flag under Core 
 Home > Statistics?
  
 What would it mean if a shard leader is not Current?
  
 Thanks,
  
 Michael Della Bitta
  
 
 Appinions
 18 East 41st Street, 2nd Floor
 New York, NY 10017-6271
  
 www.appinions.com (http://www.appinions.com)
  
 Where Influence Isn’t a Game  




Re: Meaning of Current in Solr Cloud Statistics

2013-03-14 Thread Michael Della Bitta
Stefan,

Thanks a lot! Makes sense. So I don't have to worry about my leader
thinking it's out of date, then.

Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Thu, Mar 14, 2013 at 3:11 PM, Stefan Matheis
matheis.ste...@gmail.com wrote:
 Hey Michael

 I was a bit confused because you mentioned SolrCloud in the subject. We're 
 talking about http://host:port/solr/#/collection1 (f.e.) right? And there, 
 the left-upper Box Statistics ?

 If so, the Output comes from /solr/collection1/admin/luke ( 
 http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/admin/LukeRequestHandler.java?view=markup#l551
  ) which uses DirectoryReader.isCurrent() under the Hood.

 That method contains a explanation in its javadocs: 
 http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/index/DirectoryReader.html#isCurrent()

 HTH
 Stefan



  On Thursday, March 14, 2013 at 7:01 PM, Michael Della Bitta wrote:
  [earlier quoted message snipped]



Re: Meaning of Current in Solr Cloud Statistics

2013-03-14 Thread Stefan Matheis
Perhaps the wording of "Current" is a bit too generic in that context? I'd like 
to change that description if that clarifies things .. but I'm not sure which 
one is a better fit.



On Thursday, March 14, 2013 at 8:26 PM, Michael Della Bitta wrote:

 Stefan,
  
 Thanks a lot! Makes sense. So I don't have to worry about my leader
 thinking it's out of date, then.
  
 Michael Della Bitta
  
 
 Appinions
 18 East 41st Street, 2nd Floor
 New York, NY 10017-6271
  
 www.appinions.com (http://www.appinions.com)
  
 Where Influence Isn’t a Game
  
  
  On Thu, Mar 14, 2013 at 3:11 PM, Stefan Matheis matheis.ste...@gmail.com wrote:
   [earlier quoted messages snipped]




Re: Meaning of Current in Solr Cloud Statistics

2013-03-14 Thread Mark Miller
Something like 'Reader is Current' might be better. Personally, I don't even 
know if it's worth showing.

- Mark

On Mar 14, 2013, at 3:40 PM, Stefan Matheis matheis.ste...@gmail.com wrote:

 Perhaps the wording of Current is a bit too generic in that context? I'd 
 like to change that description if that clarifies things .. but not sure 
 which one is a better fit?
 
 
 
 On Thursday, March 14, 2013 at 8:26 PM, Michael Della Bitta wrote:
 [earlier quoted messages snipped]
 
 



Solr indexing binary files

2013-03-14 Thread Luis
Hi, I am new to Solr and I am extracting metadata from binary files through
URLs stored in my database.  I would like to know what fields are available
for indexing from PDFs (the ones that would be initiated as in column="").
For example, how would I extract something like file size, format or file
type?

I would also like to know how to create customized fields in Solr.  How are
that metadata and the text content mapped into the Solr schema?  Would I have
to declare that in solrconfig.xml, or do some more tweaking somewhere
else?  If someone has a code snippet that could show me, it would be greatly
appreciated.

Thank you in advance.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-indexing-binary-files-tp4047470.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr indexing binary files

2013-03-14 Thread Jack Krupansky

Take a look at Solr Cell:

http://wiki.apache.org/solr/ExtractingRequestHandler

Include a dynamicField with a * pattern and you will see the wide variety 
of metadata that is available for PDF and other rich document formats.


-- Jack Krupansky

-Original Message- 
From: Luis

Sent: Thursday, March 14, 2013 3:30 PM
To: solr-user@lucene.apache.org
Subject: Solr indexing binary files

Hi, I am new with Solr and I am extracting metadata from binary files through
URLs stored in my database. [rest of quoted message snipped]
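
To make that concrete: the stock example solrconfig.xml does roughly the
following, so unknown Tika metadata fields get a prefix and fall into a
catch-all dynamicField, and everything the parser emits becomes visible
(the names follow the example configs, but treat them as illustrative):

<!-- solrconfig.xml -->
<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="fmap.content">text</str>
    <str name="uprefix">attr_</str>
  </lst>
</requestHandler>

<!-- schema.xml: catches attr_stream_size, attr_content_type, etc. -->
<dynamicField name="attr_*" type="text_general" indexed="true"
              stored="true" multiValued="true"/>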



Re: Question about email search

2013-03-14 Thread Alexandre Rafalovitch
Sure. copyField it into a new indexed non-stored field with the following
type definition:
<fieldType name="address_email" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
    <filter class="solr.TypeTokenFilterFactory"
            types="filter_email.txt" enablePositionIncrements="true"
            useWhitelist="true"/>
  </analyzer>
</fieldType>

Content of filter_email.txt is this single line (including the < and > signs):
<EMAIL>

You will have only the emails left as tokens. You can't display them easily,
but you can certainly search on them.
Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)
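
Wiring that into schema.xml might look like this (field names are
illustrative):

<field name="content" type="text" indexed="true" stored="true"/>
<field name="emails_in_content" type="address_email"
       indexed="true" stored="false"/>
<copyField source="content" dest="emails_in_content"/>

A query such as emails_in_content:*@gmail.com (or a plain term query for a
full address) then matches only real email tokens.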


On Thu, Mar 14, 2013 at 2:33 PM, Jorge Luis Betancourt Gonzalez 
jlbetanco...@uci.cu wrote:

 Sorry for the duplicated mail :-(, any advice on a configuration for
 searching emails in a field that does not have only email addresses, so the
 email addresses are contained in larger textual messages?

 - Mensaje original -
 De: Ahmet Arslan iori...@yahoo.com
 Para: solr-user@lucene.apache.org
 Enviados: Jueves, 14 de Marzo 2013 11:23:47
 Asunto: Re: Question about email search

 Hi,

 Since you have word delimiter filter in your analysis chain, I am not sure
 if e-mail addresses are recognised. You can check that on solr admin UI,
 analysis page.

 If e-mail addresses kept one token, I would use leading wildcard query.
 q=*@gmail.com

 There was a similar question recently:
 http://search-lucene.com/m/XF2ejnM6Vi2

 --- On Thu, 3/14/13, Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu
 wrote:

  From: Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu
  Subject: Question about email search
  To: solr-user@lucene.apache.org
  Date: Thursday, March 14, 2013, 5:11 PM
   I'm using solr 3.6.2 to crawl some data using nutch, in my schema I've one
   field with all the content extracted from the page. [rest of quoted
   message snipped]



Re: Handling a closed IndexWriter in SOLR 4.0

2013-03-14 Thread Otis Gospodnetic
Hi Scott,

Not sure why IW would be closed, but:
* consider not (hard) committing after each doc, but just periodically,
every N minutes
* soft committing instead
* using 4.2

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Thu, Mar 14, 2013 at 11:55 AM, Danzig, Scott scott.dan...@nymag.com wrote:

  Hey all,

  We're using a Solr 4 core to handle our article data. [rest of quoted
  message and stack trace snipped]
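
A minimal sketch of Otis's first two points combined: pass a commitWithin
time on the add instead of calling commit() yourself, so Solr folds commits
together and concurrent writers can't race each other on explicit commits
(URL, field names and the 60-second window are illustrative):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ArticlePublishListener {

    private final HttpSolrServer server =
            new HttpSolrServer("http://localhost:8983/solr");

    public void onPublish(String id, String title) throws Exception {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", id);
        doc.addField("title", title);
        // commit within 60 seconds; no explicit server.commit() call
        server.add(doc, 60000);
    }
}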

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Lance Norskog
Wow! That's great. And it's a lot of work, especially getting it all 
keyboard-complete. Thank you.


On 03/14/2013 01:29 AM, Chantal Ackermann wrote:

Hi all,


this is not a question. I just wanted to announce that I've written a blog post 
on how to set up Maven for packaging and automatic testing of a SOLR index 
configuration.

http://blog.it-agenten.com/2013/03/integration-testing-your-solr-index-with-maven/

Feedback or comments appreciated!
And again, thanks for that great piece of software.

Chantal





Re: Advice: solrCloud + DIH

2013-03-14 Thread rulinma
3 docs/s is low. I tested with 4 nodes on SolrCloud and got more than
1000 docs/s at about 4 KB per document. Every leader has a replica.

I am tuning to improve that to 3000 docs/s; 3 docs/s is too slow.

Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Advice-solrCloud-DIH-tp4047339p4047559.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Embedded Solr

2013-03-14 Thread rulinma
give u to test embeded solr:

import java.io.File;
import java.io.IOException;
import java.net.MalformedURLException;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;

public class EmbededSolrTest {

    private static int commitNum = 5000;

    private static String path =
            "/home/solr/Rollin/solr-4.1.0/embeddedExample";

    /**
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        // Optional overrides: args[0] = solr home, args[1] = commit batch size.
        if (args != null) {
            if (args.length > 0) {
                path = args[0].trim();
            }
            if (args.length > 1) {
                commitNum = Integer.parseInt(args[1].trim());
            }
        }
        // path = "D:\\program\\solr\\41embededtest";
        System.setProperty("solr.solr.home", path);
        CoreContainer.Initializer initializer = new
                CoreContainer.Initializer();
        CoreContainer coreContainer = initializer.initialize();
        // An empty core name selects the default core.
        EmbeddedSolrServer server = new
                EmbeddedSolrServer(coreContainer, "");
        addIndex(server);
        // query(server);
        // deleteAllDoc(server);
    }

    public static void query(SolrServer server) throws Exception {
        try {
            SolrQuery q = new SolrQuery();
            q.setQuery("*:*");
            q.setStart(0);
            q.setRows(20);
            SolrDocumentList list = server.query(q).getResults();
            System.out.println(list.getNumFound());
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            server.shutdown();
        }
    }

    public static void deleteAllDoc(SolrServer server) throws Exception {
        try {
            server.deleteByQuery("*:*");
            server.commit();
            query(server);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            server.shutdown();
        }
    }

    public static void addIndex(SolrServer solrServer) throws IOException,
            ParseException {

        // Read documents out of an existing Lucene index and re-add them to Solr.
        String path = "index";
        Analyzer analyzer = new SimpleAnalyzer(Version.LUCENE_35);
        // Analyzer analyzer = new SimpleAnalyzer();
        Directory directory = FSDirectory.open(new File(path));
        IndexReader ireader = DirectoryReader.open(directory);
        IndexSearcher isearcher = new IndexSearcher(ireader);
        QueryParser parser = new QueryParser(Version.LUCENE_35, "text",
                analyzer);
        Query query = parser.parse("*:*");
        TopDocs hits = isearcher.search(query, null, 100);
        System.out.println("find size: " + hits.totalHits);
        java.net.InetAddress addr = java.net.InetAddress.getLocalHost();
        String computerName = addr.getHostName();
        // insert2Solr(solrServer, isearcher, hits);
        long beginTime = System.currentTimeMillis();
        long totalTime = 0;
        System.out.println("begin time: " + beginTime);
        try {
            Collection<SolrInputDocument> docs = new
                    ArrayList<SolrInputDocument>();
            for (int i = 0; i < hits.scoreDocs.length; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                Document hitDoc =
                        isearcher.doc(hits.scoreDocs[i].doc);
                // Build a unique id from the loop index, host name, and thread id.
                doc.addField("id", i + "a" + computerName +
                        Thread.currentThread().getId());
                doc.addField("text", hitDoc.get("text"));
                docs.add(doc);

Re: discovery-based core enumeration with embedded solr

2013-03-14 Thread Erick Erickson
Hmmm, could you raise a JIRA and assign it to me? Please be sure to
emphasize that it's embedded, because I'm pretty sure this is fine for the
regular case.

But I have to admit that the embedded case completely slipped under the
radar.

Even better if you could make a test case, but that might not be
straightforward...

Thanks,
Erick


On Wed, Mar 13, 2013 at 5:28 PM, Michael Sokolov 
msoko...@safaribooksonline.com wrote:

 Has the new core enumeration strategy been implemented in the
 CoreContainer.Initializer.initialize() code path?  It doesn't seem like
 it has.

 I get this exception:

 Caused by: org.apache.solr.common.SolrException: Could not load config
 for solrconfig.xml
 at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:991)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1051)
 ... 10 more
 Caused by: java.io.IOException: Can't find resource 'solrconfig.xml' in
 classpath or 'solr-multi/collection1/conf/', cwd=/proj/lux
 at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:318)
 at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:283)
 at org.apache.solr.core.Config.&lt;init&gt;(Config.java:103)
 at org.apache.solr.core.Config.&lt;init&gt;(Config.java:73)
 at org.apache.solr.core.SolrConfig.&lt;init&gt;(SolrConfig.java:117)
 at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:989)
 ... 11 more

 even though I have a solr.properties file in solr-multi (which is my
 solr.home), and core.properties in some subdirectories of that

 --
 Michael Sokolov
 Senior Architect
 Safari Books Online
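
For reference, the per-core marker file the discovery strategy looks for can
be as small as a one-line sketch like this (core name illustrative):

    # solr-multi/collection1/core.properties
    name=collection1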




Re: Can we manipulate termfreq to count as 1 for multiple matches?

2013-03-14 Thread Felipe Lahti
Hi!

Take a look at
http://wiki.apache.org/solr/SchemaXml#Common_field_options
and the *omitTermFreqAndPositions* parameter.
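
For instance, a field declared with that option might look like this in
schema.xml (field and type names are illustrative):

  <field name="description" type="text_general" indexed="true" stored="true"
         omitTermFreqAndPositions="true"/>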

Or you can use a custom similarity class that overrides the term frequency
and returns one, for only that field:
http://wiki.apache.org/solr/SchemaXml#Similarity

  <fieldType name="text_dfr" class="solr.TextField">
    <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
    <similarity class="solr.MyCustomSimilarityWithoutTermFreq"/>
  </fieldType>
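
A minimal sketch of such a class, assuming the Lucene 4.x DefaultSimilarity
API (in practice the class attribute above would carry your own full package
name rather than the solr. shorthand):

  import org.apache.lucene.search.similarities.DefaultSimilarity;

  public class MyCustomSimilarityWithoutTermFreq extends DefaultSimilarity {
      @Override
      public float tf(float freq) {
          // Count any number of occurrences of a term in the field as one.
          return freq > 0 ? 1.0f : 0.0f;
      }
  }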


Best,

On Wed, Mar 13, 2013 at 8:43 PM, roz dev rozde...@gmail.com wrote:

 Hi All

 I am wondering if there is a way to count the term frequency of a certain
 field as 1, even if there are multiple matches in that document?

 Use Case is:

 Let's say that I have a document with 2 fields

 - Name and
 - Description

 And, there is a document with data like this

 Document_1
 Name = Blue Jeans
 Description = This jeans is very soft.  Jeans is pretty nice.

 Now, if I search for Jeans, then Jeans is found in 2 places in the
 Description field.

 Term Frequency for Description is 2

 I want Solr to count term frequency for Description as 1 even if Jeans is
 found multiple times in this field.

 For all other fields, i do want to get the term frequency, as it is.

 Is this doable in Solr with any of the functions?

 Any inputs are welcome.

 Thanks
 Saroj




-- 
Felipe Lahti
Consultant Developer - ThoughtWorks Porto Alegre


SOLR Num Docs vs NumFound

2013-03-14 Thread Nathan Findley
On my Solr 4 setup, a *:* query returns a higher numFound value than the 
Num Docs value reported on the statistics page of collection1. Why is 
that? My data is split across 3 data import handlers, where each handler 
has the same type of data but the ids are guaranteed to be different.


Are some of my documents not hard committed? If so, how do I hard commit? 
Otherwise, why are these numbers different?
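
(For reference, one common way to trigger a hard commit is via the update
handler — host, port, and core name being whatever the setup uses:

    curl 'http://localhost:8983/solr/collection1/update?commit=true'
)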


--
CTO
Zenlok株式会社



Re: Solr Replication

2013-03-14 Thread vicky desai
Hi,

I have a multi-core setup and there are continuous updates going on in each
core. Hence I would prefer not to take a backup, as it would either cause
downtime or, if there is write activity during the backup, leave the backup
corrupted. Can you please suggest a cleaner way to handle this?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Replication-tp4047266p4047591.html
Sent from the Solr - User mailing list archive at Nabble.com.