Alternative to waitFlush in Solr4.0 !!!?

2012-06-27 Thread stockii
Does an alternative to waitFlush exist?

In my setup this command is very useful for my NRT. Is nobody here with the
same problem?
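
(For context, a minimal sketch of how waitFlush was passed to a commit in Solr
3.x; the URL and core layout are assumptions, not from the original mail:)

curl 'http://localhost:8983/solr/update?commit=true&waitFlush=false&waitSearcher=false'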

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Alternative-to-waitFlush-in-Solr4-0-tp3991489.html
Sent from the Solr - User mailing list archive at Nabble.com.


LeaderElection bugfix

2012-06-27 Thread Trym R. Møller

Hi

In Solr Cloud, when a Solr instance loses its ZooKeeper connection, e.g. because
of a session timeout, the LeaderElector ZooKeeper Watchers handling its
replica slices are notified with two events:
a Disconnected and a SyncConnected event. Currently the
org.apache.solr.cloud.LeaderElector#checkIfIamLeader code does two bad
things when this happens:
1. On the Disconnected event it fails with a session timeout when
talking to ZooKeeper (it has no ZooKeeper connection at this point).
2. On the SyncConnected event it adds a new watcher to the ZooKeeper
leader election leader node.
As documented in the ZooKeeper programming guide, watchers are not
removed when a ZooKeeper connection is lost (even though the watchers
are notified twice),
so for each ZooKeeper connection loss the number of watchers is
doubled. It can be noted that there are two watchers per replica slice
(the overseer and collection/slice/election).


A fix for this could be, in
org.apache.solr.cloud.LeaderElector#checkIfIamLeader:
...
new Watcher() {

  @Override
  public void process(WatchedEvent event) {
    log.debug(seq + " watcher received event: " + event);
    // Reconnect should not add new watchers, as the old watchers are still available!

    if (EventType.None.equals(event.getType())) {
      log.debug("Skipping event: " + event);
      return;
    }

The behaviour of this can be verified using the test below in
org.apache.solr.cloud.LeaderElectionIntegrationTest.

Can someone confirm this and add it to svn?

Thanks in advance.

Best regards Trym

  @Test
  public void testReplicaZookeeperConnectionLoss() throws Exception {
    // who is the leader?
    String leader = getLeader();

    Set<Integer> shard1Ports = shardPorts.get("shard1");

    int leaderPort = getLeaderPort(leader);
    assertTrue(shard1Ports.toString(), shard1Ports.contains(leaderPort));

    // timeout a replica a couple of times
    System.setProperty("zkClientTimeout", "500");
    int replicaPort = 7001;
    if (leaderPort == 7001) {
      replicaPort = 7000;
    }
    assertNotSame(containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper(),
        containerMap.get(leaderPort).getZkController().getZkClient().getSolrZooKeeper());

    containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000);
    Thread.sleep(10 * 1000);
    containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000);
    Thread.sleep(10 * 1000);
    containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000);
    Thread.sleep(10 * 1000);

    // kill the leader
    if (VERBOSE) System.out.println("Killing " + leaderPort);
    shard1Ports.remove(leaderPort);
    containerMap.get(leaderPort).shutdown();

    // poll until leader change is visible
    for (int j = 0; j < 90; j++) {
      String currentLeader = getLeader();
      if (!leader.equals(currentLeader)) {
        break;
      }
      Thread.sleep(500);
    }

    leader = getLeader();
    int newLeaderPort = getLeaderPort(leader);
    int retry = 0;
    while (leaderPort == newLeaderPort) {
      if (retry++ == 20) {
        break;
      }
      Thread.sleep(1000);
      // re-check which node is the leader before the next iteration
      newLeaderPort = getLeaderPort(getLeader());
    }

    if (leaderPort == newLeaderPort) {
      fail("We didn't find a new leader! " + leaderPort
          + " was shutdown, but it's still showing as the leader");
    }

    assertTrue("Could not find leader " + newLeaderPort + " in " + shard1Ports,
        shard1Ports.contains(newLeaderPort));
  }



Re: LeaderElection bugfix

2012-06-27 Thread Sami Siren
On Wed, Jun 27, 2012 at 10:32 AM, Trym R. Møller t...@sigmat.dk wrote:
 Hi

Hi,

 The behaviour of this can be verified using the below test in the
 org.apache.solr.cloud.LeaderElectionIntegrationTest

Can you reproduce the failure in your test every time or just rarely?
I added the test method to LeaderElectionIntegrationTest and ran it a
few times, but I can't get it to fail.

--
 Sami Siren


Re: LeaderElection bugfix

2012-06-27 Thread Trym R. Møller

Hi Sami

Thanks for your rapid reply.

Regarding 1) This seems to be time dependent, but it is seen on my local
Windows machine running the unit test and on a Linux server running Solr.
Regarding 2) The test does not show that the number of Watchers is
increasing, but this can be observed either by dumping the memory from
the JVM or by looking at the debug statements (if debug is enabled).


I don't know how to make assert statements about the number of
watchers in ZooKeeper, so the test is not quite informative, but more a
confirmation that the fix doesn't break anything.


Best regards Trym

Den 27-06-2012 10:06, Sami Siren skrev:

On Wed, Jun 27, 2012 at 10:32 AM, Trym R. Møller t...@sigmat.dk wrote:

Hi

Hi,


The behaviour of this can be verified using the below test in the
org.apache.solr.cloud.LeaderElectionIntegrationTest

Can you reproduce the failure in your test every time or just rarely?
I added the test method to LeaderElectionIntegrationTest and ran it a
few times, but I can't get it to fail.

--
  Sami Siren





Re: LeaderElection bugfix

2012-06-27 Thread Trym R. Møller

Hi Sami

Regarding 2) A simple way to inspect the number of watchers is to add
an error log statement to the process method of the watcher:

  public void process(WatchedEvent event) {
    log.error(seq + " watcher received event: " + event);

and see that the number of log lines doubles for each call to
containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000);

Best regards Trym

Den 27-06-2012 10:14, Trym R. Møller skrev:

Hi Sami

Thanks for your rapid reply.

Regarding 1) This seems to be time dependent, but it is seen on my
local Windows machine running the unit test and on a Linux server running Solr.
Regarding 2) The test does not show that the number of Watchers is
increasing, but this can be observed either by dumping the memory from
the JVM or by looking at the debug statements (if debug is enabled).


I don't know how to make assert statements about the number of
watchers in ZooKeeper, so the test is not quite informative, but more a
confirmation that the fix doesn't break anything.


Best regards Trym

Den 27-06-2012 10:06, Sami Siren skrev:
On Wed, Jun 27, 2012 at 10:32 AM, Trym R. Møller t...@sigmat.dk 
wrote:

Hi

Hi,


The behaviour of this can be verified using the below test in the
org.apache.solr.cloud.LeaderElectionIntegrationTest

Can you reproduce the failure in your test every time or just rarely?
I added the test method to LeaderElectionIntegrationTest and ran it a
few times, but I can't get it to fail.

--
  Sami Siren








Re: LeaderElection bugfix

2012-06-27 Thread Sami Siren
Ok,

I see what you mean. Looks to me like you're right. I am not too
familiar with the LeaderElector so I'll let Mark take a second look.

--
 Sami Siren

On Wed, Jun 27, 2012 at 11:32 AM, Trym R. Møller t...@sigmat.dk wrote:
 Hi Sami

 Regarding 2) A simple way to inspect the number of watchers is to add an
 error log statement to the process method of the watcher:

   public void process(WatchedEvent event) {
     log.error(seq + " watcher received event: " + event);

 and see that the number of log lines doubles for each call to
 containerMap.get(replicaPort).getZkController().getZkClient().getSolrZooKeeper().pauseCnxn(2000);

 Best regards Trym

 Den 27-06-2012 10:14, Trym R. Møller skrev:

 Hi Sami

 Thanks for your rapid reply.

 Regarding 1) This seems to be time dependent, but it is seen on my local
 Windows machine running the unit test and on a Linux server running Solr.
 Regarding 2) The test does not show that the number of Watchers is increasing,
 but this can be observed either by dumping the memory from the JVM or by
 looking at the debug statements (if debug is enabled).

 I don't know how to make assert statements about the number of
 watchers in ZooKeeper, so the test is not quite informative, but more a
 confirmation that the fix doesn't break anything.

 Best regards Trym

 Den 27-06-2012 10:06, Sami Siren skrev:

 On Wed, Jun 27, 2012 at 10:32 AM, Trym R. Møller t...@sigmat.dk
 wrote:

 Hi

 Hi,

 The behaviour of this can be verified using the below test in the
 org.apache.solr.cloud.LeaderElectionIntegrationTest

 Can you reproduce the failure in your test every time or just rarely?
 I added the test method to LeaderElectionIntegrationTest and ran it a
 few times, but I can't get it to fail.

 --
  Sami Siren







Doubt (Lower/upper case issue)

2012-06-27 Thread logu
I want to search for a word which may be in lower or upper case, and the result
should include both. I mean both cases should be included in the search.
What should I change in my Solr configuration?


SV: Doubt (Lower/upper case issue)

2012-06-27 Thread Mikael Jagekrans
You should add a LowerCaseFilterFactory filter to both the index analyzer and
the query analyzer in your fieldType declaration in the schema file.

They will convert both the indexed tokens and the queries to lowercase, which
will give you case-insensitive results; see the sketch below.
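
(For illustration, a minimal sketch of such a fieldType in schema.xml; the type
name and tokenizer are assumptions, not from the original mail:)

<fieldType name="text_lowercase" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>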



Mikael Jagekrans

Software Engineer



mikae...@comaround.se

Tel: +46 8-580 886 48



Holländargatan 13, 111 36 Stockholm, Sweden

http://www.comaround.se



-Original Message-
From: logu [mailto:sloganat...@chn.aithent.com]
Sent: 27 June 2012 11:04
To: solr-user@lucene.apache.org
Subject: Doubt (Lower/upper case issue)



I want to search for a word which may be in lower or upper case, and the result
should include both. I mean both cases should be included in the search.
What should I change in my Solr configuration?


how Solr/Lucene can support standard join operation

2012-06-27 Thread Robert Yu
The ability of the join operation supported, as
http://wiki.apache.org/solr/Join describes, is quite limited.
I'm thinking about how to support a standard join operation in Solr/Lucene, because
not all data can be de-normalized efficiently.

Take the 2 schemas below as an example:

(1) Student
    sid
    name
    cid    // class id

(2) Class
    cid
    name
    major

In SQL, it is easy to get all students' names and their class names where the
student's name starts with 'p' and the class's major is CS:

    SELECT s.name, c.name FROM student s, class c WHERE s.name LIKE 'p%' AND c.major = 'CS';

How does Solr/Lucene support the above query? It seems it does not.

Thanks,

Robert Yu
Application Service - Backend
Morningstar Shenzhen Ltd.
Morningstar. Illuminating investing worldwide.

+86 755 3311-0223 voice
+86 137-2377-0925 mobile
+86 755 - fax
robert...@morningstar.com
8FL, Tower A, Donghai International Center ( or East Pacific International 
Center)
7888 Shennan Road, Futian district,
Shenzhen, Guangdong province, China 518040

http://cn.morningstar.com/

This e-mail contains privileged and confidential information and is intended 
only for the use of the person(s) named above. Any dissemination, distribution, 
or duplication of this communication without prior written consent from 
Morningstar is strictly prohibited. If you have received this message in error, 
please contact the sender immediately and delete the materials from any 
computer.



Re: how Solr/Lucene can support standard join operation

2012-06-27 Thread Lee Carroll
In your example de-normalising would be fine in a vast number of
use-cases; multi-value fields are fine.

If you really want to, see http://wiki.apache.org/solr/Join (a sketch follows
below), but make sure you lose the default relational DBA world view first,
and only go down that route if you need to.
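
(For illustration, a hedged sketch of the join query parser described on that
wiki page, assuming a Solr build where it and its fromIndex parameter are
available; the core name class and the field names follow the schemas above:)

q=name:p* AND _query_:"{!join from=cid to=cid fromIndex=class}major:CS"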



On 27 June 2012 12:27, Robert Yu robert...@morningstar.com wrote:
 The ability of the join operation supported, as
 http://wiki.apache.org/solr/Join describes, is quite limited.
 I'm thinking about how to support a standard join operation in Solr/Lucene, because
 not all data can be de-normalized efficiently.

 Take the 2 schemas below as an example:

 (1) Student
     sid
     name
     cid    // class id

 (2) Class
     cid
     name
     major

 In SQL, it is easy to get all students' names and their class names where the
 student's name starts with 'p' and the class's major is CS:

     SELECT s.name, c.name FROM student s, class c WHERE s.name LIKE 'p%' AND c.major = 'CS';

 How does Solr/Lucene support the above query? It seems it does not.

 Thanks,
 
 Robert Yu
 Application Service - Backend
 Morningstar Shenzhen Ltd.
 Morningstar. Illuminating investing worldwide.

 +86 755 3311-0223 voice
 +86 137-2377-0925 mobile
 +86 755 - fax
 robert...@morningstar.com
 8FL, Tower A, Donghai International Center ( or East Pacific International 
 Center)
 7888 Shennan Road, Futian district,
 Shenzhen, Guangdong province, China 518040

 http://cn.morningstar.com/




RE: split index horizontally (updated, a special join operation?)

2012-06-27 Thread Robert Yu
I think we can treat this as a special join operation.
Here are some clues of mine to support it.
1. Build each group as a separate index:
   Index 1's name: group1
       Key
       Group 1's fields
   Index 2's name: group2
       Key
       Group 2's fields
2. The query looks like an RDBMS join operation. For example:
   select g1.key, g1.field1, g2.field1 from group1 g1, group2 g2 where
   g1.field1 > 1 AND (g1.field2 > 100 OR g2.field1 > 99)
3. How would Solr/Lucene support the above query?
   It looks like it does not support it.

I have two ideas for a solution.
First, is it possible to use the same docid for the same key in all indexes? If
so, what we need to do is have a global docid generator which generates the same
docid for the same key, and a Hit contains index information (maybe like Segment).

I reviewed the source code of Lucene/Solr and found that docids seem to be
generated internally while building the index; more importantly, some operations
depend on their order. In other words, you cannot give a document a smaller
docid. Am I right?

Second, let the scorer merge results by key rather than by docid. Of course, this
is not as efficient as merging by docid, but since Lucene has already built the
indexes, I think it should still be fast enough.
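
(For illustration, a hedged sketch of that second idea - merging two per-index
result streams by key; the class and method names are assumptions, not Lucene
APIs:)

import java.util.ArrayList;
import java.util.List;

// A sketch: intersect two per-index hit streams on their join key.
// Both lists are assumed to arrive sorted by key.
public class KeyMergeSketch {
  static List<String> mergeByKey(List<String> keys1, List<String> keys2) {
    List<String> merged = new ArrayList<String>();
    int i = 0, j = 0;
    while (i < keys1.size() && j < keys2.size()) {
      int cmp = keys1.get(i).compareTo(keys2.get(j));
      if (cmp == 0) {        // same key in both indexes: the join matches
        merged.add(keys1.get(i));
        i++;
        j++;
      } else if (cmp < 0) {
        i++;                 // key only in index 1: no match
      } else {
        j++;                 // key only in index 2: no match
      }
    }
    return merged;
  }
}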
I'd like to hear your opinions on this topic.

Thanks,
-Original Message-
From: Robert Yu [mailto:robert...@morningstar.com] 
Sent: Friday, September 30, 2011 9:54 AM
To: solr-user@lucene.apache.org
Subject: split index horizontally

Is there an efficient way to handle my case?

Each document has several groups of fields; some of them are updated frequently,
some of them are updated infrequently. Is it possible to maintain the index based
on groups but search over all of them as ONE index?

To some extent, it is a three-layer model of a document (I think the current one
is two layers):

document = {key: groups}, ...

groups = {group-name: fields}, ...

fields = {field-name: field-value}, ...

We could maintain an index for each group, and search it like below:

   query: group-name-1:field-1:val-1 AND (
group-name-2:field-2:val-2 OR group-name-3:field-3:[min-3 TO max-3])

   return data:
group-name-1:field-1,field-2; group-name-2:field-3,field-4, ...

 

Thanks,

Robert Yu

 



RE: how Solr/Lucene can support standard join operation

2012-06-27 Thread Robert Yu
The point is that sometimes the data after de-normalization will be huge; in some
cases it's even impossible.
Thanks,
-Original Message-
From: Lee Carroll [mailto:lee.a.carr...@googlemail.com] 
Sent: Wednesday, June 27, 2012 7:38 PM
To: solr-user@lucene.apache.org
Subject: Re: how Solr/Lucene can support standard join operation

In your example de-normalising would be fine in a vast number of use-cases;
multi-value fields are fine.

If you really want to, see http://wiki.apache.org/solr/Join but make sure you
lose the default relational DBA world view first and only go down that route
if you need to.



On 27 June 2012 12:27, Robert Yu robert...@morningstar.com wrote:
 The ability of the join operation supported, as
 http://wiki.apache.org/solr/Join describes, is quite limited.
 I'm thinking about how to support a standard join operation in Solr/Lucene, because
 not all data can be de-normalized efficiently.

 Take the 2 schemas below as an example:

 (1) Student
     sid
     name
     cid    // class id

 (2) Class
     cid
     name
     major

 In SQL, it is easy to get all students' names and their class names where the
 student's name starts with 'p' and the class's major is CS:

     SELECT s.name, c.name FROM student s, class c WHERE s.name LIKE 'p%' AND c.major = 'CS';

 How does Solr/Lucene support the above query? It seems it does not.

 Thanks,
 
 Robert Yu
 Application Service - Backend
 Morningstar Shenzhen Ltd.
 Morningstar. Illuminating investing worldwide.

 +86 755 3311-0223 voice
 +86 137-2377-0925 mobile
 +86 755 - fax
 robert...@morningstar.com
 8FL, Tower A, Donghai International Center ( or East Pacific 
 International Center)
 7888 Shennan Road, Futian district,
 Shenzhen, Guangdong province, China 518040

 http://cn.morningstar.com/




Re: how Solr/Lucene can support standard join operation

2012-06-27 Thread Lee Carroll
Sorry, you have that link! And I did not see the question - apols.

The index schema could look something like:

id
name
classList - multi-valued
majorClassList - multi-valued

A standard query would do the equivalent of your SQL; see the sketch below.
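
(For illustration, a hedged sketch of such a query against the de-normalized
schema; the field names are the ones suggested above:)

q=name:p* AND majorClassList:CS&fl=name,classList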

again apols for not seeing the link

lee c



On 27 June 2012 12:37, Lee Carroll lee.a.carr...@googlemail.com wrote:
 In your example de-normalising would be fine in a vast number of
 use-cases; multi-value fields are fine.

 If you really want to, see http://wiki.apache.org/solr/Join but make
 sure you lose the default relational DBA world view first
 and only go down that route if you need to.



 On 27 June 2012 12:27, Robert Yu robert...@morningstar.com wrote:
 The ability of the join operation supported, as
 http://wiki.apache.org/solr/Join describes, is quite limited.
 I'm thinking about how to support a standard join operation in Solr/Lucene, because
 not all data can be de-normalized efficiently.

 Take the 2 schemas below as an example:

 (1) Student
     sid
     name
     cid    // class id

 (2) Class
     cid
     name
     major

 In SQL, it is easy to get all students' names and their class names where the
 student's name starts with 'p' and the class's major is CS:

     SELECT s.name, c.name FROM student s, class c WHERE s.name LIKE 'p%' AND c.major = 'CS';

 How does Solr/Lucene support the above query? It seems it does not.

 Thanks,
 
 Robert Yu
 Application Service - Backend
 Morningstar Shenzhen Ltd.
 Morningstar. Illuminating investing worldwide.

 +86 755 3311-0223 voice
 +86 137-2377-0925 mobile
 +86 755 - fax
 robert...@morningstar.com
 8FL, Tower A, Donghai International Center ( or East Pacific International 
 Center)
 7888 Shennan Road, Futian district,
 Shenzhen, Guangdong province, China 518040

 http://cn.morningstar.com/




Re: Solr seems to hang

2012-06-27 Thread Arkadi Colson

Anybody an idea?

The thread dump looks like this:

Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode):

http-8983-6 daemon prio=10 tid=0x41126000 nid=0x5c1 in 
Object.wait() [0x7fa0ad197000]

   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0x00070abf4ad0 (a 
org.apache.tomcat.util.net.JIoEndpoint$Worker)

at java.lang.Object.wait(Object.java:485)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:458)
- locked 0x00070abf4ad0 (a 
org.apache.tomcat.util.net.JIoEndpoint$Worker)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:484)

at java.lang.Thread.run(Thread.java:662)

pool-4-thread-1 prio=10 tid=0x7fa0a054d800 nid=0x5be waiting on 
condition [0x7f9f962f4000]

   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0x000702598b30 (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at 
java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)

at java.util.concurrent.DelayQueue.take(DelayQueue.java:160)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)
at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)

at java.lang.Thread.run(Thread.java:662)

http-8983-5 daemon prio=10 tid=0x412d2800 nid=0x5bd runnable 
[0x7f9f94171000]

   java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at 
org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:735)
at 
org.apache.coyote.http11.InternalInputBuffer.parseRequestLine(InternalInputBuffer.java:366)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:814)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)

at java.lang.Thread.run(Thread.java:662)

http-8983-4 daemon prio=10 tid=0x41036000 nid=0x5b1 in 
Object.wait() [0x7f9f966c9000]

   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0x00070b6e4790 (a 
org.apache.lucene.index.DocumentsWriter)

at java.lang.Object.wait(Object.java:485)
at 
org.apache.lucene.index.DocumentsWriter.waitIdle(DocumentsWriter.java:986)
- locked 0x00070b6e4790 (a 
org.apache.lucene.index.DocumentsWriter)
at 
org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:524)
- locked 0x00070b6e4790 (a 
org.apache.lucene.index.DocumentsWriter)
at 
org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3580)
- locked 0x00070b6e4858 (a 
org.apache.solr.update.SolrIndexWriter)

at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3545)
at 
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2328)
at 
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2293)
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:240)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at 
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:141)
at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:146)
at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:236)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:244)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
at 

Re: Solr seems to hang

2012-06-27 Thread Li Li
It seems that the IndexWriter wants to flush but needs to wait for the others to
become idle. But I see the n-gram filter is working. Is your field's value too
long? You should also tell us the average load of the system, the free memory,
and the memory used by the JVM.
On 2012-6-27 at 7:51 PM, Arkadi Colson ark...@smartbit.be wrote:

 Anybody an idea?

 The thread dump looks like this:

 Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode):

 http-8983-6 daemon prio=10 tid=0x41126000 nid=0x5c1 in
 Object.wait() [0x7fa0ad197000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on 0x00070abf4ad0 (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:458)
        - locked 0x00070abf4ad0 (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:484)
        at java.lang.Thread.run(Thread.java:662)

 pool-4-thread-1 prio=10 tid=0x7fa0a054d800 nid=0x5be waiting on
 condition [0x7f9f962f4000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for 0x000702598b30 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.DelayQueue.take(DelayQueue.java:160)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:662)

 http-8983-5 daemon prio=10 tid=0x412d2800 nid=0x5bd runnable
 [0x7f9f94171000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:735)
        at org.apache.coyote.http11.InternalInputBuffer.parseRequestLine(InternalInputBuffer.java:366)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:814)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:662)

 http-8983-4 daemon prio=10 tid=0x41036000 nid=0x5b1 in
 Object.wait() [0x7f9f966c9000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on 0x00070b6e4790 (a org.apache.lucene.index.DocumentsWriter)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.lucene.index.DocumentsWriter.waitIdle(DocumentsWriter.java:986)
        - locked 0x00070b6e4790 (a org.apache.lucene.index.DocumentsWriter)
        at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:524)
        - locked 0x00070b6e4790 (a org.apache.lucene.index.DocumentsWriter)
        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3580)
        - locked 0x00070b6e4858 (a org.apache.solr.update.SolrIndexWriter)
        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3545)
        at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2328)
        at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2293)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:240)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
        at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:141)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:146)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:236)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
  

Re: Solr seems to hang

2012-06-27 Thread Erick Erickson
How long is it hanging? And how are you sending files to Tika, and
especially how often do you commit? One problem that people
run into is that they commit too often, causing segments to be
merged and occasionally that just takes a while and people
think that Solr is hung.

18G isn't very large as indexes go, so it's unlikely that's your problem,
except if merging is going on in which case you might be copying a bunch
of data. So try seeing if you're getting a bunch of disk activity, you can get
a crude idea of what's going on if you just look at the index directory on
your Solr server while it's hung.

What version of Solr are you using? Details matter

Best
Erick

On Wed, Jun 27, 2012 at 7:51 AM, Arkadi Colson ark...@smartbit.be wrote:
 Anybody an idea?

 The thread Dump looks like this:

 Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode):

 http-8983-6 daemon prio=10 tid=0x41126000 nid=0x5c1 in
 Object.wait() [0x7fa0ad197000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on 0x00070abf4ad0 (a
 org.apache.tomcat.util.net.JIoEndpoint$Worker)
        at java.lang.Object.wait(Object.java:485)
        at
 org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:458)
        - locked 0x00070abf4ad0 (a
 org.apache.tomcat.util.net.JIoEndpoint$Worker)
        at
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:484)
        at java.lang.Thread.run(Thread.java:662)

 pool-4-thread-1 prio=10 tid=0x7fa0a054d800 nid=0x5be waiting on
 condition [0x7f9f962f4000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  0x000702598b30 (a
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.DelayQueue.take(DelayQueue.java:160)
        at
 java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)
        at
 java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)
        at
 java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
        at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:662)

 http-8983-5 daemon prio=10 tid=0x412d2800 nid=0x5bd runnable
 [0x7f9f94171000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at
 org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:735)
        at
 org.apache.coyote.http11.InternalInputBuffer.parseRequestLine(InternalInputBuffer.java:366)
        at
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:814)
        at
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
        at
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:662)

 http-8983-4 daemon prio=10 tid=0x41036000 nid=0x5b1 in
 Object.wait() [0x7f9f966c9000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on 0x00070b6e4790 (a
 org.apache.lucene.index.DocumentsWriter)
        at java.lang.Object.wait(Object.java:485)
        at
 org.apache.lucene.index.DocumentsWriter.waitIdle(DocumentsWriter.java:986)
        - locked 0x00070b6e4790 (a
 org.apache.lucene.index.DocumentsWriter)
        at
 org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:524)
        - locked 0x00070b6e4790 (a
 org.apache.lucene.index.DocumentsWriter)
        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3580)
        - locked 0x00070b6e4858 (a
 org.apache.solr.update.SolrIndexWriter)
        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3545)
        at
 org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2328)
        at
 org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2293)
        at
 org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:240)
        at
 org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
        at
 org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:115)
        at
 org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:141)
        at
 

Re: Solr seems to hang

2012-06-27 Thread Arkadi Colson
I've set maxFieldLength to the maximum because I'm indexing
documents which can be quite big:

<maxFieldLength>2147483647</maxFieldLength>

Load average is about 0.9 but the CPU is running at 35%, probably
because Tika has to extract the documents.

The virtual machine has 4 CPUs (2.67GHz each) and 12 GB of
memory. Tomcat is configured like this:

-Xms2048M -Xmx4096M

If you need more information, please let me know!

On 06/27/2012 03:09 PM, Li Li wrote:


It seems that the IndexWriter wants to flush but needs to wait for the
others to become idle. But I see the n-gram filter is working. Is your
field's value too long? You should also tell us the average load of the
system, the free memory, and the memory used by the JVM.

On 2012-6-27 at 7:51 PM, Arkadi Colson ark...@smartbit.be wrote:


Anybody an idea?

The thread dump looks like this:

Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed
mode):

http-8983-6 daemon prio=10 tid=0x41126000 nid=0x5c1 in
Object.wait() [0x7fa0ad197000]
  java.lang.Thread.State: WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on 0x00070abf4ad0 (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
   at java.lang.Object.wait(Object.java:485)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:458)
   - locked 0x00070abf4ad0 (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:484)
   at java.lang.Thread.run(Thread.java:662)

pool-4-thread-1 prio=10 tid=0x7fa0a054d800 nid=0x5be waiting
on condition [0x7f9f962f4000]
  java.lang.Thread.State: WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for 0x000702598b30 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
   at java.util.concurrent.DelayQueue.take(DelayQueue.java:160)
   at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)
   at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)
   at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
   at java.lang.Thread.run(Thread.java:662)

http-8983-5 daemon prio=10 tid=0x412d2800 nid=0x5bd
runnable [0x7f9f94171000]
  java.lang.Thread.State: RUNNABLE
   at java.net.SocketInputStream.socketRead0(Native Method)
   at java.net.SocketInputStream.read(SocketInputStream.java:129)
   at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:735)
   at org.apache.coyote.http11.InternalInputBuffer.parseRequestLine(InternalInputBuffer.java:366)
   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:814)
   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
   at java.lang.Thread.run(Thread.java:662)

http-8983-4 daemon prio=10 tid=0x41036000 nid=0x5b1 in
Object.wait() [0x7f9f966c9000]
  java.lang.Thread.State: WAITING (on object monitor)
   at java.lang.Object.wait(Native Method)
   - waiting on 0x00070b6e4790 (a org.apache.lucene.index.DocumentsWriter)
   at java.lang.Object.wait(Object.java:485)
   at org.apache.lucene.index.DocumentsWriter.waitIdle(DocumentsWriter.java:986)
   - locked 0x00070b6e4790 (a org.apache.lucene.index.DocumentsWriter)
   at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:524)
   - locked 0x00070b6e4790 (a org.apache.lucene.index.DocumentsWriter)
   at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3580)
   - locked 0x00070b6e4858 (a org.apache.solr.update.SolrIndexWriter)
   at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3545)
   at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2328)
   at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2293)
   at


Re: Solr seems to hang

2012-06-27 Thread Arkadi Colson
I'm sending files to Solr with the PHP Solr library. I'm doing a commit
every 1000 documents:

   <autoCommit>
     <maxDocs>1000</maxDocs>
     <!-- <maxTime>1000</maxTime> -->
   </autoCommit>

Hard to say how long it's hanging. At least for 1 hour. After that I 
restarted Tomcat to continue... I will have a look at the indexes next 
time it's hanging. Thanks for the tip!


SOLR: 3.6
TOMCAT: 7.0.28
JAVA: 1.7.0_05-b05


On 06/27/2012 03:13 PM, Erick Erickson wrote:

How long is it hanging? And how are you sending files to Tika, and
especially how often do you commit? One problem that people
run into is that they commit too often, causing segments to be
merged and occasionally that just takes a while and people
think that Solr is hung.

18G isn't very large as indexes go, so it's unlikely that's your problem,
except if merging is going on in which case you might be copying a bunch
of data. So try seeing if you're getting a bunch of disk activity, you can get
a crude idea of what's going on if you just look at the index directory on
your Solr server while it's hung.

What version of Solr are you using? Details matter

Best
Erick

On Wed, Jun 27, 2012 at 7:51 AM, Arkadi Colson ark...@smartbit.be wrote:

Anybody an idea?

The thread Dump looks like this:

Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode):

http-8983-6 daemon prio=10 tid=0x41126000 nid=0x5c1 in
Object.wait() [0x7fa0ad197000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0x00070abf4ad0 (a
org.apache.tomcat.util.net.JIoEndpoint$Worker)
at java.lang.Object.wait(Object.java:485)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:458)
- locked 0x00070abf4ad0 (a
org.apache.tomcat.util.net.JIoEndpoint$Worker)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:484)
at java.lang.Thread.run(Thread.java:662)

pool-4-thread-1 prio=10 tid=0x7fa0a054d800 nid=0x5be waiting on
condition [0x7f9f962f4000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0x000702598b30 (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at java.util.concurrent.DelayQueue.take(DelayQueue.java:160)
at
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)
at
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)
at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:662)

http-8983-5 daemon prio=10 tid=0x412d2800 nid=0x5bd runnable
[0x7f9f94171000]
   java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at
org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:735)
at
org.apache.coyote.http11.InternalInputBuffer.parseRequestLine(InternalInputBuffer.java:366)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:814)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)

http-8983-4 daemon prio=10 tid=0x41036000 nid=0x5b1 in
Object.wait() [0x7f9f966c9000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0x00070b6e4790 (a
org.apache.lucene.index.DocumentsWriter)
at java.lang.Object.wait(Object.java:485)
at
org.apache.lucene.index.DocumentsWriter.waitIdle(DocumentsWriter.java:986)
- locked 0x00070b6e4790 (a
org.apache.lucene.index.DocumentsWriter)
at
org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:524)
- locked 0x00070b6e4790 (a
org.apache.lucene.index.DocumentsWriter)
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3580)
- locked 0x00070b6e4858 (a
org.apache.solr.update.SolrIndexWriter)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3545)
at
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2328)
at
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2293)
at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:240)
  

How to index and search string which contains double quotes ()

2012-06-27 Thread ravicv
Hi,

My input string is: "Hi how r u Test"
I need to index this input text with the double quotes, but Solr is removing the
double quotes while indexing.

I am using string as the data type.

If test is searched, then I am able to get the result as Hi how r u Test (without
the double quotes).

How do I get the search result as the input string "Hi how r u Test" when I
search for test?

Thanks
Ravi

--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-index-and-search-string-which-contains-double-quotes-tp3991586.html
Sent from the Solr - User mailing list archive at Nabble.com.


Saravanan Chinnadurai/Actionimages is out of the office.

2012-06-27 Thread Saravanan . Chinnadurai
I will be out of the office starting  26/06/2012 and will not return until
28/06/2012.

Please email to itsta...@actionimages.com  for any urgent issues.


Action Images is a division of Reuters Limited and your data will therefore be 
protected
in accordance with the Reuters Group Privacy / Data Protection notice which is 
available
in the privacy footer at www.reuters.com
Registered in England No. 145516   VAT REG: 397000555


Antonyms configuration

2012-06-27 Thread RajParakh
Hi,

I need to specify an antonym list - similar to synonym list.
Whats the best way to go about it?


Currently, I am firing - RegularLuceneQuery AND (NOT keyword).
Example: the antonym list has four words - A, B1, B2, B3:
A X B1
A X B2
A X B3

The user query contains 'A'.
Expected result set: documents NOT containing any of the words B1, B2, B3.
So the Lucene query I am firing is - RegularLuceneQuery AND (NOT (B1 OR B2
OR B3)).
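
(For illustration, a hedged sketch of the same expansion expressed with a Solr
filter query; the field name text is an assumption, not from the original mail:)

q=text:A&fq=-text:(B1 OR B2 OR B3)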

Is there a cleaner way? The antonym list is growing...

Thanks,
Raj

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Antonyms-configuration-tp3991595.html
Sent from the Solr - User mailing list archive at Nabble.com.


[HIRING] Lucid Imagination seeks Senior Consultants

2012-06-27 Thread Erik Hatcher
My company, Lucid Imagination, is actively seeking full-time (and contract as 
skills/needs/availability align) professional service technologists.  Details 
can be found here: 
http://www.lucidimagination.com/about/careers/senior-consultant-position

I'll put in my personal bit and say that Lucid employs many of the folks you 
frequently see here in the open source community, and you'd be privileged to 
work alongside some of the smartest people in the field.  And you'll get the 
honor of working with our ever increasing customer base which includes many 
recognizable names.

Please feel free to share our openings with any of your colleagues or friends 
that fit our needs.

Thanks,
Erik



Re: SolrCloud cache warming issues

2012-06-27 Thread Yonik Seeley
On Tue, Jun 26, 2012 at 6:53 AM, Markus Jelsma
markus.jel...@openindex.io wrote:
 Why would the documentCache not be populated via firstSearcher warming 
 queries with a non-zero value for rows?

Solr streams documents (the stored fields) returned to the user (so
very large result sets can be supported w/o having the whole thing in
memory).
A warming query finds the document ids matching a query, but does not
send them anywhere (and the stored fields aren't needed for anything
else), hence the stored fields are never loaded.

-Yonik
http://lucidimagination.com


question about jmx value (avgRequestsPerSecond) output from solr

2012-06-27 Thread geeky2
hello all,

environment: centOS, solr 3.5, jboss 5.1

i have been using wily (a monitoring tool) to instrument our solr instances
under stress testing.

can someone help me to understand something about the jmx values being
output from solr?  please note - i am new to JMX.

problem / issue statement: for a given request handler (partItemDescSearch),
i see output from the JMX MBean for the metric avgRequestsPerSecond AFTER
my test harness has completed and there is NO request activity to this
request handler taking place (verified in the solr log files).

example scenario during testing:  during a test run - the test harness will
fire requests at request handler (partItemDescSearch) and all numbers look
fine.   then after the test harness is done - the metric
avgRequestsPerSecond does not immediately drop to 0.  instead - it appears
as if JMX is somehow averaging this metric and gradually trending it
downward toward 0.

continual checking of this metric (in the JMX tree - see screen shot) shows
the number trending downward instead of a hard stop at 0.

is this behavior - just the way jmx works?
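
(For what it's worth, a hedged sketch of how such an average is typically
computed - a cumulative mean over the handler's whole lifetime; the exact
formula is an assumption about Solr's statistics code, not verified against 3.5:)

avgRequestsPerSecond = numRequests / ((currentTimeMillis - handlerStartTime) / 1000.0)

With a lifetime average like this, the value can only decay toward 0
asymptotically after traffic stops; it never drops straight to 0.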

thanks mark

http://lucene.472066.n3.nabble.com/file/n3991616/test1.bmp 




--
View this message in context: 
http://lucene.472066.n3.nabble.com/question-about-jmx-value-avgRequestsPerSecond-output-from-solr-tp3991616.html
Sent from the Solr - User mailing list archive at Nabble.com.


Wildcard queries on whole words

2012-06-27 Thread Klostermeyer, Michael
I am researching an issue w/ wildcard searches on complete words in 3.5.  For 
example, searching for kloster* returns klostermeyer, but klostermeyer* 
returns nothing.

The field being queried has the following analysis chain (standard 
'text_general'):

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

I see that wildcard queries are not analyzed at query time, which could be the 
source of my issue, but I read conflicting advice on the interwebs.  I read 
also that this might have changed in 3.6, but I am unable to determine if my 
specific issue is addressed.

My questions:

1. Why am I getting these search results with my current config?

2. How do I fix it in 3.5? Would upgrading to 3.6 also fix my issue?

Thanks!

Mike Klostermeyer



Re: SolrCloud cache warming issues

2012-06-27 Thread Erik Hatcher

On Jun 27, 2012, at 12:01 , Yonik Seeley wrote:

 On Tue, Jun 26, 2012 at 6:53 AM, Markus Jelsma
 markus.jel...@openindex.io wrote:
 Why would the documentCache not be populated via firstSearcher warming 
 queries with a non-zero value for rows?
 
 Solr streams documents (the stored fields) returned to the user (so
 very large result sets can be supported w/o having the whole thing in
 memory).
 A warming query finds the document ids matching a query, but does not
 send them anywhere (and the stored fields aren't needed for anything
 else), hence the stored fields are never loaded.


But if highlighting were enabled on those warming queries, it'd fill in the 
document cache, right?

Erik



Re: SolrCloud cache warming issues

2012-06-27 Thread Yonik Seeley
On Wed, Jun 27, 2012 at 12:23 PM, Erik Hatcher erik.hatc...@gmail.com wrote:

 On Jun 27, 2012, at 12:01 , Yonik Seeley wrote:

 On Tue, Jun 26, 2012 at 6:53 AM, Markus Jelsma
 markus.jel...@openindex.io wrote:
 Why would the documentCache not be populated via firstSearcher warming 
 queries with a non-zero value for rows?

 Solr streams documents (the stored fields) returned to the user (so
 very large result sets can be supported w/o having the whole thing in
 memory).
 A warming query finds the document ids matching a query, but does not
 send them anywhere (and the stored fields aren't needed for anything
 else), hence the stored fields are never loaded.


 But if highlighting were enabled on those warming queries, it'd fill in the 
 document cache, right?

Correct.

-Yonik
http://lucidimagination.com


RE: SolrCloud cache warming issues

2012-06-27 Thread Markus Jelsma
Interesting!

We also tried routing the warming queries through our main search request
handler, with highlighting enabled, which has distrib=true as a default. To
prevent warming queries from running over the cluster on all instances we set
distrib=false in the warming queries. The queries were fired at start-up but
the Solr instance stays unreachable from the outside. It caused an awful number
of socket timeout exceptions.

How is warming on a cluster supposed to behave? Is distrib=false enforced if it
is a default for the used handler?
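
(For reference, a hedged sketch of a firstSearcher warming entry in
solrconfig.xml with highlighting enabled and distrib forced off; the query and
field names are assumptions, not from the original mails:)

<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">some warming query</str>
      <str name="rows">10</str>
      <str name="hl">true</str>
      <str name="hl.fl">content</str>
      <str name="distrib">false</str>
    </lst>
  </arr>
</listener>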

Thanks

 
 
-Original message-
 From:Yonik Seeley yo...@lucidimagination.com
 Sent: Wed 27-Jun-2012 18:27
 To: solr-user@lucene.apache.org
 Subject: Re: SolrCloud cache warming issues
 
 On Wed, Jun 27, 2012 at 12:23 PM, Erik Hatcher erik.hatc...@gmail.com wrote:
 
  On Jun 27, 2012, at 12:01 , Yonik Seeley wrote:
 
  On Tue, Jun 26, 2012 at 6:53 AM, Markus Jelsma
  markus.jel...@openindex.io wrote:
  Why would the documentCache not be populated via firstSearcher warming 
  queries with a non-zero value for rows?
 
  Solr streams documents (the stored fields) returned to the user (so
  very large result sets can be supported w/o having the whole thing in
  memory).
  A warming query finds the document ids matching a query, but does not
  send them anywhere (and the stored fields aren't needed for anything
  else), hence the stored fields are never loaded.
 
 
  But if highlighting were enabled on those warming queries, it'd fill in the 
  document cache, right?
 
 Correct.
 
 -Yonik
 http://lucidimagination.com
 


Re: Wildcard queries on whole words

2012-06-27 Thread Michael Della Bitta
Hi Michael,

I solved a similar issue by reformatting my query to do an OR across
an exact match or a wildcard query, with the exact match boosted.

HTH,

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com


On Wed, Jun 27, 2012 at 12:14 PM, Klostermeyer, Michael
mklosterme...@riskexchange.com wrote:
 I am researching an issue w/ wildcard searches on complete words in 3.5.  For 
 example, searching for kloster* returns klostermeyer, but klostermeyer* 
 returns nothing.

 The field being queried has the following analysis chain (standard 
 'text_general'):

 <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>

 I see that wildcard queries are not analyzed at query time, which could be 
 the source of my issue, but I read conflicting advice on the interwebs.  I 
 read also that this might have changed in 3.6, but I am unable to determine 
 if my specific issue is addressed.

 My questions:

 1.       Why am I getting these search results with my current config?

 2.       How do I fix it in 3.5?  Would upgrading to 3.6 also fix my issue?

 Thanks!

 Mike Klostermeyer



RE: Wildcard queries on whole words

2012-06-27 Thread Klostermeyer, Michael
Interesting solution. Can you then explain to me, for a given query:

?q='kloster' OR kloster*

how the exact match part of that is boosted (assuming the above is how you
formulated your query)?

Thanks!

Mike

-Original Message-
From: Michael Della Bitta [mailto:michael.della.bi...@appinions.com] 
Sent: Wednesday, June 27, 2012 11:11 AM
To: solr-user@lucene.apache.org
Subject: Re: Wildcard queries on whole words

Hi Michael,

I solved a similar issue by reformatting my query to do an OR across an exact 
match or a wildcard query, with the exact match boosted.

HTH,

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn't a Game.
http://www.appinions.com


On Wed, Jun 27, 2012 at 12:14 PM, Klostermeyer, Michael 
mklosterme...@riskexchange.com wrote:
 I am researching an issue w/ wildcard searches on complete words in 3.5.  For 
 example, searching for kloster* returns klostermeyer, but klostermeyer* 
 returns nothing.

 The field being queried has the following analysis chain (standard 
 'text_general'):

 <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
     <filter class="solr.LowerCaseFilterFactory"/>
   </analyzer>
 </fieldType>

 I see that wildcard queries are not analyzed at query time, which could be 
 the source of my issue, but I read conflicting advice on the interwebs.  I 
 read also that this might have changed in 3.6, but I am unable to determine 
 if my specific issue is addressed.

 My questions:

 1.       Why am I getting these search results with my current config?

 2.       How do I fix it in 3.5?  Would upgrading to 3.6 also fix my issue?

 Thanks!

 Mike Klostermeyer



Re: Wildcard queries on whole words

2012-06-27 Thread Erick Erickson
q=kloster^3 OR kloster*
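
That is, boost the exact term and OR in the prefix. As a fuller sketch (the
field name "name" is an assumption; debugQuery is only there so you can
inspect the clause scores):

q=name:kloster^3 OR name:kloster*&debugQuery=true

The prefix clause is rewritten to a constant-score query, so all wildcard
matches score the same, while the ^3 boost lifts exact matches to the top.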

On Wed, Jun 27, 2012 at 2:16 PM, Klostermeyer, Michael
mklosterme...@riskexchange.com wrote:
 Interesting solution.  Can you then explain to me for a given query:

 ?q='kloster' OR kloster*

 How the exact match part of that is boosted (assuming the above is how you 
 formulated your query)?

 [...]



Re: Wildcard queries on whole words

2012-06-27 Thread Michael Della Bitta
We're doing:

?q='kloster'^2 OR kloster*

This is for a homegrown autocomplete index based on a database of
context-free terms, so we have kind of a weird use case.

Note that wildcard matches will all be scored the same, so you might
need to do something to order them to suit your needs. In our case,
we're storing the value length and sorting on that, among other
things, but YMMV.
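
As a sketch of that ordering trick (the field and type names here are
assumptions, not our actual schema), store the completion's length in an
int field at index time and sort on it after score:

<field name="term_length" type="int" indexed="true" stored="false"/>

?q='kloster'^2 OR kloster*&sort=score desc,term_length asc

Shorter values then surface first among the equally-scored wildcard matches.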

Michael Della Bitta


Appinions, Inc. -- Where Influence Isn’t a Game.
http://www.appinions.com


On Wed, Jun 27, 2012 at 2:16 PM, Klostermeyer, Michael
mklosterme...@riskexchange.com wrote:
 Interesting solution.  Can you then explain to me for a given query:

 ?q='kloster' OR kloster*

 How the exact match part of that is boosted (assuming the above is how you 
 formulated your query)?

 Thanks!

 Mike

 -Original Message-
 From: Michael Della Bitta [mailto:michael.della.bi...@appinions.com]
 Sent: Wednesday, June 27, 2012 11:11 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Wildcard queries on whole words

 Hi Michael,

 I solved a similar issue by reformatting my query to do an OR across an exact 
 match or a wildcard query, with the exact match boosted.

 HTH,

 Michael Della Bitta

 
 Appinions, Inc. -- Where Influence Isn't a Game.
 http://www.appinions.com


 On Wed, Jun 27, 2012 at 12:14 PM, Klostermeyer, Michael 
 mklosterme...@riskexchange.com wrote:
 I am researching an issue w/ wildcard searches on complete words in 3.5.  
 For example, searching for kloster* returns klostermeyer, but 
 klostermeyer* returns nothing.

 The field being queried has the following analysis chain (standard 
 'text_general'):

 fieldType name=text_general class=solr.TextField
 positionIncrementGap=100
      analyzer type=index
        tokenizer class=solr.StandardTokenizerFactory/
        filter class=solr.StopFilterFactory ignoreCase=true
 words=stopwords.txt enablePositionIncrements=true /
        filter class=solr.LowerCaseFilterFactory/
      /analyzer
      analyzer type=query
        tokenizer class=solr.StandardTokenizerFactory/
        filter class=solr.StopFilterFactory ignoreCase=true
 words=stopwords.txt enablePositionIncrements=true /
        filter class=solr.SynonymFilterFactory
 synonyms=synonyms.txt ignoreCase=true expand=true/
        filter class=solr.LowerCaseFilterFactory/
      /analyzer
 /fieldType

 I see that wildcard queries are not analyzed at query time, which could be 
 the source of my issue, but I read conflicting advice on the interwebs.  I 
 read also that this might have changed in 3.6, but I am unable to determine 
 if my specific issue is addressed.

 My questions:

 1.       Why am I getting these search results with my current config?

 2.       How do I fix it in 3.5?  Would upgrading to 3.6 also fix my issue?

 Thanks!

 Mike Klostermeyer



Index Snapshot for 3.3?

2012-06-27 Thread garlandkr
How can I get a snapshot of the index in SOLR 3.x? 

I am currently taking EBS (Amazon) snapshots of the volume where the data is
from one machine and creating new volumes from that snapshot. When the
service starts it still runs through an indexing process that takes forever.
Is there a way to get a snapshot of the index and use that without having to
re-index on the new system?



Re: Index Snapshot for 3.3?

2012-06-27 Thread Upayavira
Use once-off replication, or, if you prefer, on Unix do

 cp -lr your-index-dir your-backup-dir

at a time you know a commit isn't happening.

You'll have a clone of the index you can ship to another host. Remember
to delete your backup when done.

This uses the fact that files in a Lucene index never change; they're
only ever deleted and new ones added. Thus, using hard links to clone an
index works well.
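
If you'd rather have Solr take the copy for you, the ReplicationHandler can
do it too (assuming it is enabled in your solrconfig.xml):

http://master:8983/solr/replication?command=backup

which writes a snapshot.<timestamp> directory alongside the live index.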

Upayavira

On Wed, Jun 27, 2012, at 02:30 PM, garlandkr wrote:
 [...]


Re: Searching for digits with strings

2012-06-27 Thread Upayavira
How many numbers? 0-9? Or every number under the sun?

You could achieve a limited number by using synonyms: 0 is a synonym for
nought and zero, etc.
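
For example, a few hypothetical synonyms.txt entries:

zero,nought,0
one,1
two,2

wired in with something like

<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>

in the index-time analyzer, so both representations end up in the index.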

Upayavira

On Wed, Jun 27, 2012, at 05:22 PM, Alireza Salimi wrote:
 Hi,
 
 I was wondering if there's a built in solution in Solr so that you can
 search for documents with digits by their string representations.
 i.e. search for 'two' would match fields which have '2' token and vice
 versa.
 
 Thanks
 
 -- 
 Alireza Salimi
 Java EE Developer


Re: Searching for digits with strings

2012-06-27 Thread Alireza Salimi
Hi,

Well, that's the only solution I've got so far, and it would work for most
cases, but I thought there might be some better solutions.

Thanks

On Wed, Jun 27, 2012 at 5:49 PM, Upayavira u...@odoko.co.uk wrote:

 [...]




-- 
Alireza Salimi
Java EE Developer


SSL Client Cert Keystore for Solr Replication config?

2012-06-27 Thread Matt Wise
Our Solr master server protects access to itself by requiring that the clients 
provide a signed SSL client cert from the same CA as the Solr server itself. 
This is all handled within an Nginx reverse-proxy that's on the Solr server 
itself.

This works great for clients... not so great for replication. We want to do 
replication access control the same way... but I have no idea how to get 
Tomcat/Solr to use a particular keypair when making outbound HTTPS requests to 
https://master/solr/replication. Any ideas?



Re: Searching for digits with strings

2012-06-27 Thread Sascha Szott
Hi,

as far as I know Solr does not provide such a feature. If you cannot make any 
assumptions on the numbers, choose an appropriate library that is able to 
transform between numerical and non-numerical representations and populate the 
search field with both versions at index-time.
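
As a minimal client-side sketch of that (assuming ICU4J on the classpath
and SolrJ for indexing; the field name is made up):

import java.util.Locale;
import org.apache.solr.common.SolrInputDocument;
import com.ibm.icu.text.RuleBasedNumberFormat;

// Put both the digit and its spelled-out form into the same search field.
RuleBasedNumberFormat spellout =
    new RuleBasedNumberFormat(Locale.ENGLISH, RuleBasedNumberFormat.SPELLOUT);
SolrInputDocument doc = new SolrInputDocument();
doc.addField("body_search", "2");
doc.addField("body_search", spellout.format(2)); // "two"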

-Sascha

Alireza Salimi alireza.sal...@gmail.com schrieb:

Hi,

Well that's the only solution I got so far and it would work for most of
the cases,
but l thought there might be some better solutions.

Thanks

On Wed, Jun 27, 2012 at 5:49 PM, Upayavira u...@odoko.co.uk wrote:

 How many numbers? 0-9? Or every number under the sun?

 You could achieve a limited number by using synonyms, 0 is a synonym for
 nought and zero, etc.

 Upayavira

 On Wed, Jun 27, 2012, at 05:22 PM, Alireza Salimi wrote:
  Hi,
 
  I was wondering if there's a built in solution in Solr so that you can
  search for documents with digits by their string representations.
  i.e. search for 'two' would match fields which have '2' token and vice
  versa.
 
  Thanks





Query Logic Question

2012-06-27 Thread Rublex
Hi,

Can someone explain to me please why these two queries return different
results:

1. -PaymentType:Finance AND -PaymentType:Lease AND -PaymentType:Cash (700
results)

2. (-PaymentType:Finance AND -PaymentType:Lease) AND -PaymentType:Cash (0
results)

Logically, the two queries above should return the same results, no?

Thank you



Re: Antonyms configuration

2012-06-27 Thread Lee Carroll
Have a field which uses a synonym file of your antonyms plus a keep-words
filter, and use this field in your NOT query.
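
One possible sketch (file and field names are assumptions; antonyms.txt
would hold one-way mappings such as A => B1,B2,B3, and the field would be
a copyField of your text):

<fieldType name="text_antonyms" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="antonyms.txt"
            ignoreCase="true" expand="false"/>
    <filter class="solr.KeepWordFilterFactory" words="antonym-terms.txt"
            ignoreCase="true"/>
  </analyzer>
</fieldType>

A query like RegularLuceneQuery AND -antonyms:A then expands A to its
antonyms at query time, so the growing list lives in antonyms.txt rather
than in every query.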



On 27 June 2012 15:54, RajParakh rajpar...@gmail.com wrote:
 Hi,

 I need to specify an antonym list - similar to a synonym list.
 What's the best way to go about it?


 Currently, I am firing - RegularLuceneQuery AND (NOT keyword)
 Example: the antonym list has four words - A, B1, B2, B3
 A X B1
 A X B2
 A X B3

 User query contains 'A'
 Expected result set: documents NOT containing any of the words B1, B2, B3.
 So the Lucene query I am firing is - RegularLuceneQuery AND (NOT (B1 OR B2
 OR B3))

 Is there a cleaner way? The antonym list is growing ...

 Thanks,
 Raj



Trying to avoid filtering on score, as I'm told that's bad

2012-06-27 Thread mcb
I have a function query that returns miles as a score along two points:

q={!func}sub(sum(geodist(OriginCoordinates,39,-105),geodist(DestinationCoordinates,36,-97),Mileage),1000)

The issue that I'm having now: my results give me a list of scores:

score: 10.1 (mi)
score: 20 (mi)
score: 75 (mi)
But I would like to also add a clause that cuts off the results after X
miles (say 50) so that 75 above would not be included in the results.
Unfortunately I can't say fq=score:[0 TO 50], but perhaps there is another
way? I'm on Solr 4.0.

Thanks!



Re: Trying to avoid filtering on score, as I'm told that's bad

2012-06-27 Thread Yonik Seeley
On Wed, Jun 27, 2012 at 6:50 PM, mcb thestreet...@gmail.com wrote:
 [...]

If you want to cut off the whole function at 75, then frange can do that:
q={!frange u=75}sub(sum(...

http://lucene.apache.org/solr/api/org/apache/solr/search/FunctionRangeQParserPlugin.html
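
Applied to the query above, with u=50 for the 50-mile cutoff:

q={!frange l=0 u=50}sub(sum(geodist(OriginCoordinates,39,-105),geodist(DestinationCoordinates,36,-97),Mileage),1000)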

-Yonik
http://lucidimagination.com


Re: FastVectorHighlighter failure with multiValued fields

2012-06-27 Thread Lance Norskog
I think: text fields are not exactly multi-valued. Instead there is
something called the 'positionIncrementGap' which gives a sweep
(usually 100) of empty positions (terms) to distinguish one field from
the next. If you set this to zero or one, that should give you one
long multi-valued field.

2) You can do anything with javascript in DIH.
http://lucidworks.lucidimagination.com/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler#UploadingStructuredDataStoreDatawiththeDataImportHandler-TheScriptTransformer
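
A minimal sketch of the mechanics (entity and field names are hypothetical,
and this alone doesn't solve the child-to-parent problem described below;
it just shows where a row can be rewritten):

<dataConfig>
  <script><![CDATA[
    function flatten(row) {
      // 'row' is a java.util.Map for the current entity row
      var v = row.get('node_name');
      if (v != null) row.put('node_name_flat', v.toString());
      return row;
    }
  ]]></script>
  ...
  <entity name="nodes" transformer="script:flatten" query="...">
  </entity>
</dataConfig>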

On Tue, Jun 26, 2012 at 6:31 AM, Duncan McIntyre dun...@calligram.co.uk wrote:
 I think I may have identified a bug with FVH. So I have two questions:

 1) Does anyone know how to make FVH return a highlighted snippet when the
 query matches all of one string in a multivalued field?
 2) If not, does anyone know how to make DIH concatenate all the values in a
 multivalued field into one single field?

 Imagine a document which looks like this:

 <doc>
   <str name="department_name">Obstetrics and Gynaecology</str>
   <arr name="node_names">
     <str>Refer to specialist</str>
     <str>Identify adverse psycho social factors</str>
   </arr>
 </doc>

 If I search the document and ask for matches to be highlighted with the
 original highlighter, I get 'node_names' in the highlighting results:

 q=node_names:("Refer to specialist")&hl=true&hl.fl=*

 But if I repeat the search using the FVH, 'node_names' does not appear in
 the highlighting results:

 q=node_names:("Refer to specialist")&hl=true&hl.fl=*&hl.useFastVectorHighlighter=true

 A search for something less than the full string (e.g. "Refer to") works in
 both cases.

 I have tried every combination of hl.requireFieldMatch,
 hl.usePhraseHighlighter with no effect.

 node_names is defined as either:

 <field name="node_names" type="text_en_splitting" indexed="true"
        stored="true" multiValued="true" termVectors="true"
        termPositions="true" termOffsets="true"/>

 OR:

 <field name="node_names" type="text_en" indexed="true"
        stored="true" multiValued="true" termVectors="true"
        termPositions="true" termOffsets="true"/>

 And I have tried setting preserveOriginal="1" on the
 WordDelimiterFilterFactory.

 Now FVH seems to work fine with single-valued fields, so doing a query
 q=department_name:("Obstetrics and Gynaecology") works as expected. Given
 that, I have tried unsuccessfully to use either a Javascript or native Java
 transformer to merge the contents of node_names into a single
 node_names_flat field during data import. This fails because child entities
 have no access to their parent entity.

 <entity name="pathway">
   <entity name="pages">
     <entity name="nodes">
       <!-- produces multiple node_names and there seems to be no way to
            push them up into 'pages' or 'pathway' -->
     </entity>
   </entity>
 </entity>

 Duncan.



-- 
Lance Norskog
goks...@gmail.com


Re: SSL Client Cert Keystore for Solr Replication config?

2012-06-27 Thread Lance Norskog
I believe this is what the Java 'keystore' is for. You give the Java VM a
start option for the keystore file, and from then on outgoing sockets
use the certs for the target clients.

http://www.lazgosoftware.com/kse/index.html
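
Concretely, the standard JSSE system properties (paths and passwords below
are placeholders) passed to the Tomcat JVM should make its outbound HTTPS
connections present the client cert:

-Djavax.net.ssl.keyStore=/path/to/client-keystore.jks
-Djavax.net.ssl.keyStorePassword=changeit
-Djavax.net.ssl.trustStore=/path/to/truststore.jks
-Djavax.net.ssl.trustStorePassword=changeit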

On Wed, Jun 27, 2012 at 3:03 PM, Matt Wise m...@nextdoor.com wrote:
 [...]




-- 
Lance Norskog
goks...@gmail.com


Re: Wildcard queries on whole words

2012-06-27 Thread Jack Krupansky
I would understand if you had said that "Klostermeyer*" returned nothing
because the presence of the wildcard used to suppress analysis, including
the lower case filter, so that the capital "K" term would never match an
indexed term. But I would have expected "klostermeyer*" to match
"klostermeyer", since the unanalyzed wildcard prefix would have the same
term value as "klostermeyer" when it is indexed. So, this is a mystery to
me.


-- Jack Krupansky

-Original Message- 
From: Klostermeyer, Michael

Sent: Wednesday, June 27, 2012 11:14 AM
To: solr-user@lucene.apache.org
Subject: Wildcard queries on whole words

[...]



Re: what is precisionStep and positionIncrementGap

2012-06-27 Thread Li Li
1. precisionStep is used for range queries on numeric fields. See
http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/all/org/apache/lucene/search/NumericRangeQuery.html
2. positionIncrementGap is used for phrase queries on multi-valued fields.
E.g., doc1 has two titles:
   title1: ab cd
   title2: xy zz
If your positionIncrementGap is 0, the positions of the 4 terms are
0, 1, 2, 3. If you search for the phrase "cd xy", it will hit, but you
probably think it should not match. So you can set positionIncrementGap to
a larger value, e.g. 100. Then the positions become 0, 1, 100, 101, and the
phrase query will not match.
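
For illustration, typical schema.xml declarations (the values are examples):

<fieldType name="tint" class="solr.TrieIntField" precisionStep="8"
           omitNorms="true" positionIncrementGap="0"/>
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  ...
</fieldType>

A smaller non-zero precisionStep indexes more trie terms per value, which
makes range queries like price:[10 TO 100] faster at the cost of a larger
index; precisionStep="0" indexes only the full-precision term.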

On Thu, Jun 28, 2012 at 10:00 AM, ZHANG Liang F
liang.f.zh...@alcatel-sbell.com.cn wrote:
 Hi,
 in the schema.xml, usually there will be fieldType definition like this: 
 fieldType name=int class=solr.TrieIntField precisionStep=0 
 omitNorms=true positionIncrementGap=0/

 the precisionStep and positionIncrementGap is not very clear to me. Could you 
 please elaborate more on these 2?

 Thanks!

 Liang


Re: How to index and search string which contains double quotes ()

2012-06-27 Thread Jack Krupansky
The quotes are probably indexed correctly. You need to escape the quotes in
your query:

Hi how r u \"Test\"
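
If you build queries from SolrJ, ClientUtils can do the escaping for you
(note it escapes whitespace and other special characters as well, so apply
it to the embedded value rather than the whole query):

import org.apache.solr.client.solrj.util.ClientUtils;

// turns: Hi how r u "Test"   into:   Hi\ how\ r\ u\ \"Test\"
String escaped = ClientUtils.escapeQueryChars("Hi how r u \"Test\"");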

-- Jack Krupansky

-Original Message- 
From: ravicv

Sent: Wednesday, June 27, 2012 8:50 AM
To: solr-user@lucene.apache.org
Subject: How to index and search string which contains double quotes ()

Hi,

My input string is: Hi how r u "Test"
I need to index this input text with double quotes, but Solr is removing
the double quotes while indexing.

I am using "string" as the data type.

If test is searched, I am able to get the result as Hi how r u Test (without
the double quotes).

How do I get the search result as the input string, Hi how r u "Test", when
I search for test?

Thanks
Ravi




Re: Query Logic Question

2012-06-27 Thread Li Li
I think they are logically the same, but 1 may be a little bit faster than 2.

On Thu, Jun 28, 2012 at 5:59 AM, Rublex ruble...@hotmail.com wrote:
 [...]


Re: Query Logic Question

2012-06-27 Thread Jack Krupansky
It should work properly with the edismax query parser. The traditional 
lucene query parser is not smart enough about the fact that the Lucene 
BooleanQuery can't properly handle queries with only negative clauses.


Put *:* in front of all your negative terms and you will get similar 
results. edismax does that automatically.


(*:* -PaymentType:Finance AND *:* -PaymentType:Lease) AND 
*:* -PaymentType:Cash


-- Jack Krupansky

-Original Message- 
From: Rublex

Sent: Wednesday, June 27, 2012 4:59 PM
To: solr-user@lucene.apache.org
Subject: Query Logic Question

[...]



Re: question about DIH

2012-06-27 Thread wangjing
Does your database table have a timestamp field? Its value changes
automatically every time the row is updated or modified, so the delta
import can key off it.

On Tue, Jun 19, 2012 at 6:24 PM, alex.wang wang_...@sohu.com wrote:
 hi all:
    when I import data from the DB to Solr, Solr changes the value with the
 timezone. E.g., the original value 16/02/2012 12:05:16 is changed to
 1/02/2012 04:05:06. I add the 8 hours in my SQL and then it's correct,
 but when I use delta-import mode to add to the index, it's not working.
 The SQL in the config is as follows:

 <entity name="sharing" pk="id"
         query="select t.id, t.content, t.money, t.tradeTgtId, t0.name,
                DATE_ADD(t.time, INTERVAL 8 HOUR) as time from sharing t
                LEFT JOIN tradetgt t0 on t.tradeTgtId = t0.id"
         deltaImportQuery="select t.id, t.content, t.money, t.tradeTgtId,
                t0.name, t.time from sharing t LEFT JOIN tradetgt t0 on
                t.tradeTgtId = t0.id where t.id = '${dataimporter.delta.id}'"
         deltaQuery="select id from sharing where DATE_ADD(time, INTERVAL
                8 HOUR) &gt; '${dataimporter.last_index_time}'">
   <field column="id" name="id"/>
   <field column="content" name="sharing_content"/>
   <field column="time" name="sharing_time"/>
   <field column="money" name="sharing_money"/>
   <field column="name" name="trade_name"/>
 </entity>

 There are no errors in the log, and I don't know why.
