Re[2]: Cannot run Solr4 from Intellij Idea

2012-12-05 Thread Artyom
See the screenshots:

solr_idea1: adding an IDEA Tomcat artifact
solr_idea2: adding an IDEA facet
solr_idea3: placing modules into the artifact (drag modules from the "Available
Elements" pane into the artifact) and the created facet


Wednesday, 5 December 2012, 7:28, from "sarowe [via Lucene]":
>Hi Artyom,
>
>I don't use IntelliJ artifacts - I just edit/compile/test.
>
>I can include this stuff in the IntelliJ configuration if you'll help me.  Can 
>you share screenshots of what you're talking about, and/or IntelliJ config 
>files?
>
>Steve
>
>On Dec 5, 2012, at 8:24 AM, Artyom <[hidden email]> wrote:
>
>
>> IntelliJ IDEA is not so intelligent with Solr: to fix this problem I've
>> dragged these modules into the IDEA's artifact (parent module is wrong):
>> 
>> analysis-common
>> analysis-extras
>> analysis-uima
>> clustering
>> codecs
>> codecs-resources
>> dataimporthandler
>> dataimporthandler-extras
>> lucene-core
>> lucene-core-resources
>> solr-core
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/Cannot-run-Solr4-from-Intellij-Idea-tp4024233p4024452.html
>> Sent from the Solr - User mailing list archive at Nabble.com.

solr_idea1.png (102K)
solr_idea2.png (117K)
solr_idea3.png (148K)





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Cannot-run-Solr4-from-Intellij-Idea-tp4024233p4024723.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SOLR4 cluster - strange CPU spike on slave

2012-12-05 Thread John Nielsen
I'm not sure I understand why this is important. Too much memory would just
be unused.

This is what the heap looks like now:

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize  = 17179869184 (16384.0MB)
   NewSize  = 21757952 (20.75MB)
   MaxNewSize   = 283508736 (270.375MB)
   OldSize  = 65404928 (62.375MB)
   NewRatio = 7
   SurvivorRatio= 8
   PermSize = 21757952 (20.75MB)
   MaxPermSize  = 176160768 (168.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 255197184 (243.375MB)
   used = 108828496 (103.78694152832031MB)
   free = 146368688 (139.5880584716797MB)
   42.644865548359654% used
Eden Space:
   capacity = 226885632 (216.375MB)
   used = 83498424 (79.63030242919922MB)
   free = 143387208 (136.74469757080078MB)
   36.80198841326365% used
From Space:
   capacity = 28311552 (27.0MB)
   used = 25330072 (24.156639099121094MB)
   free = 2981480 (2.8433609008789062MB)
   89.46903370044849% used
To Space:
   capacity = 28311552 (27.0MB)
   used = 0 (0.0MB)
   free = 28311552 (27.0MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 16896360448 (16113.625MB)
   used = 12452710200 (11875.829887390137MB)
   free = 4443650248 (4237.795112609863MB)
   73.70054775005708% used
Perm Generation:
   capacity = 70578176 (67.30859375MB)
   used = 37652032 (35.90777587890625MB)
   free = 32926144 (31.40081787109375MB)
   53.347981109627995% used
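
[Note: a heap summary in this format is typically what "jmap -heap <pid>"
prints for a running HotSpot JVM - a hedged pointer, assuming the JDK tools
are on the PATH:

    # dump heap configuration and per-generation usage for the Solr process
    jmap -heap <solr-pid>
]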



Med venlig hilsen / Best regards

*John Nielsen*
Programmer



*MCB A/S*
Enghaven 15
DK-7500 Holstebro

Kundeservice: +45 9610 2824
p...@mcb.dk
www.mcb.dk



On Thu, Nov 29, 2012 at 4:08 AM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:

> If this is caused by index segment merging you should be able to see that
> very clearly on the Index report in SPM, where you would see sudden graph
> movement at the time of spike and corresponding to CPU and disk activity.
> I think uncommenting that infostream in solrconfig would also show it.
>
> Otis
> --
> SOLR Performance Monitoring - http://sematext.com/spm
> On Nov 28, 2012 9:20 PM, "Erick Erickson"  wrote:
>
> > Am I reading this right? All you're doing on varnish1 is replicating to
> it?
> > You're not searching or indexing? I'm sure I'm misreading this.
> >
> >
> > "The spike, which only lasts for a couple of minutes, sends the disks
> > racing" This _sounds_ suspiciously like segment merging, especially the
> > "disks racing" bit. Or possibly replication. Neither of which make much
> > sense. But is there any chance that somehow multiple commits are being
> > issued? Of course if varnish1 is a slave, that shouldn't be happening
> > either.
> >
> > And the whole bit about nothing going to the logs is just bizarre. I'm
> > tempted to claim hardware gremlins, especially if you see nothing similar
> > on varnish2. Or some other process is pegging the machine. All of which
> is
> > a way of saying "I have no idea"
> >
> > Yours in bewilderment,
> > Erick
> >
> >
> >
> > On Wed, Nov 28, 2012 at 6:15 AM, John Nielsen  wrote:
> >
> > > I apologize for the late reply.
> > >
> > > The query load is more or less stable during the spikes. There are
> > > always fluctuations, but nothing on the order of magnitude that could
> > > explain this spike. In fact, the latest spike occurred last night when
> > > almost no one was using it.
> > >
> > > To test a hunch of mine, I tried to deactivate all caches by commenting
> > > out all cache entries in solrconfig.xml. It still spikes, so I don't
> > > think it has anything to do with cache warming or hits/misses or
> > > anything of the sort.
> > >
> > > One interesting thing GC though. This is our latest spike with cpu load
> > > (this server has 8 cores, so a load higher than 8 is potentially
> > > troublesome):
> > >
> > > 2012.Nov.27 19:58:18    2.27
> > > 2012.Nov.27 19:57:17    4.06
> > > 2012.Nov.27 19:56:18    8.95
> > > 2012.Nov.27 19:55:17   19.97
> > > 2012.Nov.27 19:54:17   32.27
> > > 2012.Nov.27 19:53:18    1.67
> > > 2012.Nov.27 19:52:17    1.6
> > > 2012.Nov.27 19:51:18    1.77
> > > 2012.Nov.27 19:50:17    1.89
> > >
> > > This is what the GC was doing around that time:
> > >
> > > 2012-11-27T19:50:04.933+0100: [GC [PSYoungGen:
> > 4777586K->277641K(4969216K)]
> > > 8887542K->4387693K(9405824K), 0.0856360 secs] [Times: user=0.54
> sys=0.00,
> > > real=0.09 secs]
> > > 2012-11-27T19:50:30.785+0100: [GC [PSYoungGen:
> > 4749769K->325171K(5068096K)]
> > > 8859821K->4435320K(9504704K), 0.0992160 secs] [Times: user=0.63
> sys=0.00,
> > > real=0.10 secs]
> > > 2012-11-27T19:51:12.293+0100: [GC [PSYoungGen:
> > 4911603K->306181K(5071168K)]
> > > 9021752K->4416617K(9507776K), 0.0957890 secs] [Times: user=0.62
> sys=0.00,
> > > real=0.09 secs]
> > > 2012-11-27T19:51:52.817+0100: [GC [PSYoungGen:
> > 4892613K->376175K(5075328K)]
> > > 9003049K->4486755K(9511936K), 0.1099830 secs] [Times: u
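
[Note: GC log entries in this format are what HotSpot (Java 6/7 era) emits
with flags along these lines - a hedged sketch, exact flag names vary by JVM
version:

    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:gc.log
]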

Re: SOLR4 cluster - strange CPU spike on slave

2012-12-05 Thread John Nielsen
Very interesting!

I've seen references to NRTCachingDirectory, MMapDirectory, FSDirectory,
RamDirectory and NIOFSDirectory, and that's just what I can remember. I have
tried to search for more information about these, but I'm not having much
luck.

Is there a place where I can read up on these?

Med venlig hilsen / Best regards

*John Nielsen*
Programmer



*MCB A/S*
Enghaven 15
DK-7500 Holstebro

Kundeservice: +45 9610 2824
p...@mcb.dk
www.mcb.dk



On Wed, Dec 5, 2012 at 1:11 AM, Mark Miller  wrote:

>
> On Dec 4, 2012, at 2:25 AM, John Nielsen  wrote:
>
> > The post about mmapdirectory is really interesting. We switched to using
> > that from NRTCachingDirectory and am monitoring performance as well.
> > Initially performance doesn't look stellar, but i suspect that we lack
> > memory in the server to really make it shine.
>
> NRTCachingDirectory delegates to another directory and simply caches small
> segments in RAM - usually it delegates to MMapDirectory by default. So you
> likely won't notice any changes, because you likely haven't really changed
> anything. NRTCachingDirectory simply helps in the NRT case and doesn't
> really hurt that I've seen in the std case. It's more like a help dir than
> a replacement.
>
> - Mark
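
[Note: the directory implementation is selected via the directoryFactory
element in solrconfig.xml - a minimal sketch using the Solr 4.x class names:

    <!-- NRTCachingDirectoryFactory wraps the default directory (usually
         MMapDirectory on 64-bit platforms) and caches small segments in RAM -->
    <directoryFactory name="DirectoryFactory"
                      class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
]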


Re: Concern with using external SQL server for DIH

2012-12-05 Thread Shawn Heisey

On 12/5/2012 9:42 AM, Spadez wrote:

I am looking to import entries to my SOLR server by using the DIH,
connecting to an external postgre SQL server using the JDBC driver. I will
be importing about 50,000 entries each time.

Is connecting to an external SQL server for my data unreliable or risky, or
is it instead perfectly reasonable?

My alternative is to export the SQL file on the other server, download the
SQL file to my SOLR server, import it to my Solr servers copy of postgreSQL
and then run the DIH on the local database.


I use DIH in situations that require a full reindex.  The MySQL database 
has 78 million records and imports simultaneously to seven Solr shards 
on two servers.  It takes about three hours.


The only instability that we ever noticed was on older Solr versions 
(1.4.x) with a low mergeFactor.  We ran into a situation where Solr was 
doing a lot of simultaneous merges and stopped indexing data long enough 
that the JDBC connection timed out.  We increased our mergeFactor, and 
newer Solr versions have better configuration possibilities, so now we 
have more merging threads.


Thanks,
Shawn
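
[Note: for reference, a DIH JDBC data source for PostgreSQL might be declared
along these lines in data-config.xml - a hedged sketch; host, credentials and
batchSize are placeholders:

    <dataSource type="JdbcDataSource"
                driver="org.postgresql.Driver"
                url="jdbc:postgresql://dbhost:5432/mydb"
                user="solr" password="secret"
                readOnly="true" batchSize="500"/>
]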



Re: Incremental Update of index

2012-12-05 Thread Gora Mohanty
On 6 December 2012 11:13, Amit Jha  wrote:
> Thanks Sandeep,
>
> How can it be done when using a database, given that the database has all
> the records: old, new and updated?

You need to do a delta-import:
http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command

Regards,
Gora
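
[Note: a delta-import entity in data-config.xml typically looks like this - a
hedged sketch with hypothetical table and column names:

    <entity name="item" pk="id"
            query="SELECT id, name FROM item"
            deltaQuery="SELECT id FROM item
                        WHERE last_modified > '${dih.last_index_time}'"
            deltaImportQuery="SELECT id, name FROM item
                              WHERE id = '${dih.delta.id}'"/>
]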


Re: Incremental Update of index

2012-12-05 Thread Amit Jha
Thanks Sandeep,

How can it be done when using a database, given that the database has all
the records: old, new and updated?

On Wed, Dec 5, 2012 at 11:47 PM, Sandeep Mestry  wrote:

> Hi Amit/Shanu,
>
> You can create the solr document for only the updated record and index it
> to ensure only the updated record gets indexed.
> You need not rebuild indexes from scratch for every record update.
>
> Thanks,
> Sandeep
>


Re: SolrCloud - ClusterState says we are the leader,but locally ...

2012-12-05 Thread Sudhakar Maddineni
Yep, after restarting, the cluster came back to a normal state. We will run a
couple more tests and see if we can reproduce this issue.

Btw, I am attaching the server logs before that 'INFO: *Waiting until we
see more replicas*' message. From the logs, we can see that the leader election
process started on 003, which was initially the replica for 001. Does that mean
leader 001 went down at that time?

logs on 003:

12:11:16 PM org.apache.solr.cloud.ShardLeaderElectionContext
runLeaderProcess
INFO: Running the leader process.
12:11:16 PM org.apache.solr.cloud.ShardLeaderElectionContext shouldIBeLeader
INFO: Checking if I should try and be the leader.
12:11:16 PM org.apache.solr.cloud.ShardLeaderElectionContext shouldIBeLeader
INFO: My last published State was Active, it's okay to be the
leader.
12:11:16 PM org.apache.solr.cloud.ShardLeaderElectionContext
runLeaderProcess
INFO: I may be the new leader - try and sync
12:11:16 PM org.apache.solr.cloud.RecoveryStrategy close
WARNING: Stopping recovery for zkNodeName=<003>:8080_solr_core
core=core1.
12:11:16 PM org.apache.solr.cloud.SyncStrategy sync
INFO: Sync replicas to http://<003>:8080/solr/core1/
12:11:16 PM org.apache.solr.update.PeerSync sync
INFO: PeerSync: core=core1 url=http://<003>:8080/solr START
replicas=[<001>:8080/solr/core1/] nUpdates=100
12:11:16 PM org.apache.solr.common.cloud.ZkStateReader$3 process
INFO: Updating live nodes -> this message is on 002
12:11:46 PM org.apache.solr.update.PeerSync handleResponse
WARNING: PeerSync: core=core1 url=http://<003>:8080/solr  exception
talking to <001>:8080/solr/core1/, failed
org.apache.solr.client.solrj.SolrServerException: Timeout occured
while waiting response from server at: <001>:8080/solr/core1
at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:409)
at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
at
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:166)
at
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:133)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown
Source)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
12:11:46 PM org.apache.solr.update.PeerSync sync
INFO: PeerSync: core=core1 url=http://<003>:8080/solr DONE. sync
failed
12:11:46 PM org.apache.solr.common.SolrException log
SEVERE: Sync Failed
12:11:46 PM org.apache.solr.cloud.ShardLeaderElectionContext
rejoinLeaderElection
INFO: There is a better leader candidate than us - going back into
recovery
12:11:46 PM org.apache.solr.update.DefaultSolrCoreState doRecovery
INFO: Running recovery - first canceling any ongoing recovery
12:11:46 PM org.apache.solr.cloud.RecoveryStrategy run
INFO: Starting recovery process.  core=core1
recoveringAfterStartup=false
12:11:46 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
INFO: Attempting to PeerSync from <001>:8080/solr/core1/ core=core1
- recoveringAfterStartup=false
12:11:46 PM org.apache.solr.update.PeerSync sync
INFO: PeerSync: core=core1 url=http://<003>:8080/solr START
replicas=[<001>:8080/solr/core1/] nUpdates=100
12:11:46 PM org.apache.solr.cloud.ShardLeaderElectionContext
runLeaderProcess
INFO: Running the leader process.
12:11:46 PM org.apache.solr.cloud.ShardLeaderElectionContext
waitForReplicasToComeUp
INFO: *Waiting until we see more replicas up: total=2 found=1
timeoutin=17*
12:11:47 PM org.apache.solr.cloud.ShardLeaderElectionContext
waitForReplicasToComeUp
INFO: *Waiting until we see more replicas up: total=2 found=1
timeoutin=179495*
12:11:48 PM org.apache.solr.cloud.ShardLeaderElectionContext
waitForReplicasToComeUp
INFO: *Waiting until we see more replicas up: total=2 found=1
timeoutin=178985*



Thanks for your help.
Sudhakar.

On Wed, Dec 5, 2012 at 6:19 PM, Mark Miller  wrote:

> The waiting logging had to happen on restart unless it's some kind of bug.
>
> Beyond that, something is off, but I have no clue why - it seems your
> clusterstate.json is not up to date at all.
>
> Have you tried restarting the cluster then? Does that help at all?
>
> Do you see any exceptions around zookeeper session timeouts?
>
> - Mark

RE: SolrCloud - Query performance degrades with multiple servers

2012-12-05 Thread Michael Ryan
As you add nodes, the average response time of the slowest node will likely 
increase. For example, consider an extreme case where you have something like 1 
million nodes - you're practically guaranteed that one of them is going to be 
doing something like a stop-the-world garbage collection. So even if 999,999 
return in 10ms, there's going to be that one slowpoke that takes 1000ms, and it 
doesn't matter how fast the other 999,999 are.

-Michael
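
[To put rough numbers on it - a back-of-the-envelope sketch with made-up
probabilities: if each node independently spends 0.1% of its time in a
stop-the-world pause, the chance that a fan-out query hits at least one paused
node is 1 - 0.999^n, i.e. about 0.3% for n=3 nodes, about 9.5% for n=100, and
effectively certain at the million-node extreme above.]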

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com] 
Sent: Wednesday, December 05, 2012 11:00 PM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud - Query performance degrades with multiple servers

This is just the std scatter gather distrib search stuff solr has been using 
since around 1.4.

There is some overhead to that, but generally not much. I've measured it at 
around 30-50ms for 100 machines, each with 10 million docs, a few years ago.

So...that doesn't help you much...but FYI...

- Mark

On Dec 5, 2012, at 5:35 PM, sausarkar  wrote:

> We are using SolrCloud and trying to configure it for testing purposes. We
> are seeing that the average query time increases if we have more than one
> node in the SolrCloud cluster. We have a single-shard 12 GB index.
>
> Example: 1 node, average query time *~28 msec*, load 140 queries/second;
> 3 nodes, average query time *~110 msec*, load 420 queries/second distributed
> equally on three servers, so essentially 140 qps on each node.
>
> Is there any inter-node communication going on for queries? Is there any
> setting on SolrCloud for query tuning for a cloud config with multiple
> nodes? Please help.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Concern with using external SQL server for DIH

2012-12-05 Thread Gora Mohanty
On 5 December 2012 22:12, Spadez  wrote:
> Hi,
>
> I am looking to import entries to my SOLR server by using the DIH,
> connecting to an external postgre SQL server using the JDBC driver. I will
> be importing about 50,000 entries each time.

Unless you have a lot of data in each entry, importing 50,000
entries should be pretty trivial.

> Is connecting to an external SQL server for my data unreliable or risky, or
> is it instead perfrectly reasonable?
[...]

Reliability is largely a matter of network connectivity, load on the
database and Solr servers, and to a lesser extent, on the JDBC
driver used (for SQL server, jTDS seemed significantly better to
us). Again, for 50,000 entries, these should typically not be of
concern.

"Risky" in what sense? Are the data sensitive?

Regards,
Gora


Re: solr cloud node stuck at recovering for days

2012-12-05 Thread Mark Miller
Looks like the connection to ZooKeeper is "flapping". So as it tries to recover
it keeps losing the connection to ZooKeeper and then trying again, and I don't
have enough of the log to tell, but that probably just repeats and repeats.

I guess the network is probably not so fast, and/or the load and network are
not working out so well…

Try raising the zkClientTimeout - see the FAQ: 
http://wiki.apache.org/solr/SolrCloud#FAQ

It defaults to like 15 seconds. You might try 30 or 40 seconds. If that doesn't 
help, you have to figure out why Solr is having so much trouble communicating 
with ZooKeeper.

Are you using an external ensemble, a single ZooKeeper node, embedded ZooKeeper?

- Mark
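
[Note: in Solr 4.0 the timeout lives in solr.xml and is overridable with a
system property - a minimal sketch, with the value raised to 40 seconds:

    <cores adminPath="/admin/cores" defaultCoreName="collection1"
           host="${host:}" hostPort="${jetty.port:8983}"
           zkClientTimeout="${zkClientTimeout:40000}">
]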

On Dec 5, 2012, at 7:59 PM, Nathaniel Domingo  wrote:

> here's a link to a portion of the log in pastebin.
> 
> http://pastebin.com/UDBMDdMv
> 
> Thanks
> 
> 
> On Thu, Dec 6, 2012 at 11:53 AM, Mark Miller  wrote:
> 
>> I think the list strips most attachments or something - can you try
>> something like pastebin.com?
>> 
>> Thanks,
>> 
>> mark
>> 
>> On Dec 5, 2012, at 7:46 PM, Nathaniel Domingo 
>> wrote:
>> 
>>> attached is a log relevant to the recovery.
>>> 
>>> Thanks
>>> 
>>> 
>>> On Thu, Dec 6, 2012 at 11:23 AM, Mark Miller 
>> wrote:
>>> Okay - logs from that node would help a lot then (or just the parts
>> around when it's trying to recover).
>>> 
>>> - Mark
>>> 
>>> On Dec 5, 2012, at 7:11 PM, Nathaniel Domingo 
>> wrote:
>>> 
 yes, i tried restarting the node twice already and both times it just
>> got
 stuck in recovering. one node also had some problems a few days ago,
 after a restart, it eventually moved from recovering to active after an
 hour. i'm using solr 4.0.0.
 
 Thanks
 
 
 On Thu, Dec 6, 2012 at 11:03 AM, Mark Miller 
>> wrote:
 
> Did you try restarting that node? Have you seen a successful recovery
> before? What exact version are you using?
> 
> Can you share any related info in the logs for that node?
> 
> - Mark
> 
> On Dec 5, 2012, at 6:48 PM, Nathaniel Domingo >> 
> wrote:
> 
>> Hi,
>> 
>> I'm very new to solr, less than a month. I just set up a solrcloud
> cluster
>> last week and have encountered a problem. i have four nodes with two
> shards.
>> one node is stuck at recovering for days now. how do i go about
>> fixing
> this?
>> 
>> Thanks
> 
> 
>>> 
>>> 
>> 
>> 



Re: SolrCloud - Query performance degrades with multiple servers

2012-12-05 Thread Mark Miller
This is just the std scatter gather distrib search stuff solr has been using 
since around 1.4.

There is some overhead to that, but generally not much. I've measured it at 
around 30-50ms for 100 machines, each with 10 million docs, a few years ago.

So…that doesn't help you much…but FYI…

- Mark

On Dec 5, 2012, at 5:35 PM, sausarkar  wrote:

> We are using SolrCloud and trying to configure it for testing purposes. We
> are seeing that the average query time increases if we have more than one
> node in the SolrCloud cluster. We have a single-shard 12 GB index.
>
> Example: 1 node, average query time *~28 msec*, load 140 queries/second;
> 3 nodes, average query time *~110 msec*, load 420 queries/second distributed
> equally on three servers, so essentially 140 qps on each node.
>
> Is there any inter-node communication going on for queries? Is there any
> setting on SolrCloud for query tuning for a cloud config with multiple
> nodes? Please help.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Occasional "failed to respond" errors

2012-12-05 Thread Michael Ryan
We have a longstanding issue with "failed to respond" errors in Solr when our 
coordinator is querying our Solr shards.

To elaborate further...  we're using the built-in distributed capabilities of 
Solr 3.6, and using Jetty as our server.  Occasionally, we will have a query 
fail due to an error like 
"org.apache.commons.httpclient.NoHttpResponseException: The server 
solr-shard-13 failed to respond" when the Solr coordinator is sending a request 
to one of its shards.  Over the long term, this happens for about 1 out of 3000 
queries.  The quick fix of simply retrying the query when such an intermittent 
error occurs works fine, but I'm trying to figure out what the root cause might 
be.

I've got lots of theories and possible fixes, but was hoping someone had run 
into this before and knows the answer straight away :)

-Michael
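
[For anyone hitting the same thing: the "simply retry" workaround with
commons-httpclient 3.x usually looks something like this - a hedged sketch of
client-side retry, not what Solr's coordinator does internally; the URL is a
placeholder:

    import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.methods.GetMethod;
    import org.apache.commons.httpclient.params.HttpMethodParams;

    public class RetryingQuery {
        public static void main(String[] args) throws Exception {
            HttpClient client = new HttpClient();
            GetMethod get =
                new GetMethod("http://solr-shard-13:8983/solr/select?q=*:*");
            // Retry up to 3 times on transient failures such as
            // NoHttpResponseException; 'false' = don't retry requests whose
            // body was already fully sent.
            get.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
                    new DefaultHttpMethodRetryHandler(3, false));
            try {
                int status = client.executeMethod(get);
                System.out.println("HTTP " + status);
            } finally {
                get.releaseConnection();
            }
        }
    }
]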


Re: solr cloud node stuck at recovering for days

2012-12-05 Thread Nathaniel Domingo
here's a link to a portion of the log in pastebin.

http://pastebin.com/UDBMDdMv

Thanks


On Thu, Dec 6, 2012 at 11:53 AM, Mark Miller  wrote:

> I think the list strips most attachments or something - can you try
> something like pastebin.com?
>
> Thanks,
>
> mark
>
> On Dec 5, 2012, at 7:46 PM, Nathaniel Domingo 
> wrote:
>
> > attached is a log relevant to the recovery.
> >
> > Thanks
> >
> >
> > On Thu, Dec 6, 2012 at 11:23 AM, Mark Miller 
> wrote:
> > Okay - logs from that node would help a lot then (or just the parts
> around when it's trying to recover).
> >
> > - Mark
> >
> > On Dec 5, 2012, at 7:11 PM, Nathaniel Domingo 
> wrote:
> >
> > > yes, i tried restarting the node twice already and both times it just
> got
> > > stuck in recovering. one node also had some problems a few days ago,
> > > after a restart, it eventually moved from recovering to active after an
> > > hour. i'm using solr 4.0.0.
> > >
> > > Thanks
> > >
> > >
> > > On Thu, Dec 6, 2012 at 11:03 AM, Mark Miller 
> wrote:
> > >
> > >> Did you try restarting that node? Have you seen a successful recovery
> > >> before? What exact version are you using?
> > >>
> > >> Can you share any related info in the logs for that node?
> > >>
> > >> - Mark
> > >>
> > >> On Dec 5, 2012, at 6:48 PM, Nathaniel Domingo  >
> > >> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> I'm very new to solr, less than a month. I just set up a solrcloud
> > >> cluster
> > >>> last week and have encountered a problem. i have four nodes with two
> > >> shards.
> > >>> one node is stuck at recovering for days now. how do i go about
> fixing
> > >> this?
> > >>>
> > >>> Thanks
> > >>
> > >>
> >
> >
>
>


Re: solr cloud node stuck at recovering for days

2012-12-05 Thread Mark Miller
I think the list strips most attachments or something - can you try something 
like pastebin.com?

Thanks,

mark

On Dec 5, 2012, at 7:46 PM, Nathaniel Domingo  wrote:

> attached is a log relevant to the recovery.
> 
> Thanks
> 
> 
> On Thu, Dec 6, 2012 at 11:23 AM, Mark Miller  wrote:
> Okay - logs from that node would help a lot then (or just the parts around 
> when it's trying to recover).
> 
> - Mark
> 
> On Dec 5, 2012, at 7:11 PM, Nathaniel Domingo  wrote:
> 
> > yes, i tried restarting the node twice already and both times it just got
> > stuck in recovering. one node also had some problems a few days ago,
> > after a restart, it eventually moved from recovering to active after an
> > hour. i'm using solr 4.0.0.
> >
> > Thanks
> >
> >
> > On Thu, Dec 6, 2012 at 11:03 AM, Mark Miller  wrote:
> >
> >> Did you try restarting that node? Have you seen a successful recovery
> >> before? What exact version are you using?
> >>
> >> Can you share any related info in the logs for that node?
> >>
> >> - Mark
> >>
> >> On Dec 5, 2012, at 6:48 PM, Nathaniel Domingo 
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> I'm very new to solr, less than a month. I just set up a solrcloud
> >> cluster
> >>> last week and have encountered a problem. i have four nodes with two
> >> shards.
> >>> one node is stuck at recovering for days now. how do i go about fixing
> >> this?
> >>>
> >>> Thanks
> >>
> >>
> 
> 



Re: solr cloud node stuck at recovering for days

2012-12-05 Thread Nathaniel Domingo
attached is a log relevant to the recovery.

Thanks


On Thu, Dec 6, 2012 at 11:23 AM, Mark Miller  wrote:

> Okay - logs from that node would help a lot then (or just the parts around
> when it's trying to recover).
>
> - Mark
>
> On Dec 5, 2012, at 7:11 PM, Nathaniel Domingo 
> wrote:
>
> > yes, i tried restarting the node twice already and both times it just got
> > stuck in recovering. one node also had some problems a few days ago,
> > after a restart, it eventually moved from recovering to active after an
> > hour. i'm using solr 4.0.0.
> >
> > Thanks
> >
> >
> > On Thu, Dec 6, 2012 at 11:03 AM, Mark Miller 
> wrote:
> >
> >> Did you try restarting that node? Have you seen a successful recovery
> >> before? What exact version are you using?
> >>
> >> Can you share any related info in the logs for that node?
> >>
> >> - Mark
> >>
> >> On Dec 5, 2012, at 6:48 PM, Nathaniel Domingo 
> >> wrote:
> >>
> >>> Hi,
> >>>
> > >>> I'm very new to solr, less than a month. I just set up a solrcloud
> > >> cluster
> > >>> last week and have encountered a problem. i have four nodes with two
> >> shards.
> >>> one node is stuck at recovering for days now. how do i go about fixing
> >> this?
> >>>
> >>> Thanks
> >>
> >>
>
>


Re: Reloading config to zookeeper

2012-12-05 Thread Mark Miller
Right, solrhome is not required for upconfig, just for the bootstrap cmd.

You can also just upload modified files, but the tool doesn't really
let you do it in a fine grained way. But there are lots of zookeeper
tools you can use to do this if you wanted.

- Mark

On Thu, Nov 22, 2012 at 10:45 AM, Marcin Rzewucki  wrote:
> I think solrhome is not mandatory.
> Yes, reloading is uploading the config dir again. It's a pity we can't update
> just modified files.
> Regards.
>
> On 22 November 2012 19:38, Cool Techi  wrote:
>
>> Thanks, but why do we need to specify the -solrhome?
>>
>> I am using the following command to load new config,
>>
>> java -classpath .:/Users/solr-cli-lib/* org.apache.solr.cloud.ZkCLI -cmd
>> upconfig -zkhost
>> localhost:2181,localhost:2182,localhost:2183,localhost:2184,localhost:2185
>> -confdir /Users/config-files -confname myconf
>>
>> So basically reloading is just uploading the configs back again?
>>
>> Regards,
>> Ayush
>>
>> > Date: Thu, 22 Nov 2012 19:32:27 +0100
>> > Subject: Re: Reloading config to zookeeper
>> > From: mrzewu...@gmail.com
>> > To: solr-user@lucene.apache.org
>> >
>> > Hi,
>> >
>> > I'm using "cloud-scripts/zkcli.sh" script for reloading configuration,
>> for
>> > example:
>> > $ ./cloud-scripts/zkcli.sh -cmd upconfig -confdir  -solrhome
>> >  -confname  -z 
>> >
>> > Then I'm reloading collection on each node in cloud, but maybe someone
>> > knows better solution.
>> > Regards.
>> >
>> > On 22 November 2012 19:23, Cool Techi  wrote:
>> >
>> > > When we make changes to our config files, how do we reload the files
>> into
>> > > zookeeper.
>> > >
>> > > Also, I understand that we would need to reload the collection, would
>> we
>> > > need to do this at a per shard level or just at the cloud level.
>> > >
>> > > Regards,
>> > > Ayush
>> > >
>> > >
>>
>>



-- 
- Mark
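
[A condensed sketch of the above, assuming Solr 4.0's Collections API;
paths, hosts, collection and config names are placeholders:

    # upload the modified config set to ZooKeeper...
    java -classpath .:/path/to/solr-cli-lib/* org.apache.solr.cloud.ZkCLI \
         -cmd upconfig -zkhost localhost:2181 \
         -confdir /path/to/config-files -confname myconf

    # ...then reload every core of the collection with one call
    curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection'
]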


Re: Creating a collection without bootstrap

2012-12-05 Thread Mark Miller
Yup, it should say config set, not collections.

A config set is all the config files for a single collection -
schema.xml, solrconfig.xml and related config files. You can name the
set of them so that multiple collections can use the same 'config
set'.

If you don't use a bootstrap option, upconfig is first - that gets the
conf files into zookeeper. As part of that, you give the name of the
'conf set' you are uploading.

If you could help make any of this clearer on the wiki, please do!

So, if you are not predefining your collection in solr.xml on each
node (like collection1 is by default):

Upload the conf files with upconfig, then create your collection with
the collection api, then link it to the conf set. Or you can link by
specifying the config set to use as part of the collection api call.

Linking can actually be done at any time with zkcli - even before you
create the collection - if you link ahead of time, once you create the
collection (using the same name you linked with), it will find the
link and use it.

You can also auto link by naming the conf set the same name as the collection.

- Mark

On Tue, Dec 4, 2012 at 3:10 PM, Walter Underwood  wrote:
> Here is one problem. On the SolrCloud wiki page, it says "link collection 
> sets to collections", but I'm pretty sure that should read "config set".
>
> Also "config set" (or "conf set") is never defined.
>
> wunder
>
> On Dec 4, 2012, at 11:07 AM, Walter Underwood wrote:
>
>> I seem to be missing a step or some kind of ordering in creating a new 
>> collection without using bootstrap upload. I have these steps:
>>
>> * zookeeper upconfig (pretty sure this is first)
>> * Collection API create collection
>> * zookeeper linkconfig
>>
>> I'm working from this page: http://wiki.apache.org/solr/SolrCloud
>>
>> A step-by-step recipe would be really nice.
>>
>> wunder
>> --
>> Walter Underwood
>> wun...@wunderwood.org
>> Search Guy, Chegg.com
>
>
>



-- 
- Mark
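
[The step-by-step recipe Walter asks for, as a hedged sketch against Solr 4.0 -
hosts, names and shard counts are placeholders:

    # 1. upload the config set to ZooKeeper
    sh cloud-scripts/zkcli.sh -cmd upconfig -zkhost zk1:2181 \
       -confdir ./myconf-dir -confname myconf

    # 2. (optional) link the config set to the collection name ahead of time
    sh cloud-scripts/zkcli.sh -cmd linkconfig -zkhost zk1:2181 \
       -collection mycollection -confname myconf

    # 3. create the collection; passing collection.configName here can
    #    replace step 2
    curl 'http://host:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&collection.configName=myconf'
]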


Re: solr cloud node stuck at recovering for days

2012-12-05 Thread Mark Miller
Okay - logs from that node would help a lot then (or just the parts around when 
it's trying to recover).

- Mark

On Dec 5, 2012, at 7:11 PM, Nathaniel Domingo  wrote:

> yes, i tried restarting the node twice already and both times it just got
> stuck in recovering. one node also had some problems a few days ago,
> after a restart, it eventually moved from recovering to active after an
> hour. i'm using solr 4.0.0.
> 
> Thanks
> 
> 
> On Thu, Dec 6, 2012 at 11:03 AM, Mark Miller  wrote:
> 
>> Did you try restarting that node? Have you seen a successful recovery
>> before? What exact version are you using?
>> 
>> Can you share any related info in the logs for that node?
>> 
>> - Mark
>> 
>> On Dec 5, 2012, at 6:48 PM, Nathaniel Domingo 
>> wrote:
>> 
>>> Hi,
>>> 
>>> I'm very new to solr, less than a month. I just set up a solrcloud
>> cluster
>>> last week and have encountered a problem. i have four nodes with two
>> shards.
>>> one node is stuck at recovering for days now. how do i go about fixing
>> this?
>>> 
>>> Thanks
>> 
>> 



Re: solr cloud node stuck at recovering for days

2012-12-05 Thread Nathaniel Domingo
yes, i tried restarting the node twice already and both times it just got
stuck in recovering. one node also had some problems a few days ago,
after a restart, it eventually moved from recovering to active after an
hour. i'm using solr 4.0.0.

Thanks


On Thu, Dec 6, 2012 at 11:03 AM, Mark Miller  wrote:

> Did you try restarting that node? Have you seen a successful recovery
> before? What exact version are you using?
>
> Can you share any related info in the logs for that node?
>
> - Mark
>
> On Dec 5, 2012, at 6:48 PM, Nathaniel Domingo 
> wrote:
>
> > Hi,
> >
> > I'm very new to solr, less than a month. I just set up a solrcloud
> cluster
> > last week and have encountered a problem. i have four nodes with two
> shards.
> > one node is stuck at recovering for days now. how do i go about fixing
> this?
> >
> > Thanks
>
>


Re: solr cloud node stuck at recovering for days

2012-12-05 Thread Mark Miller
Did you try restarting that node? Have you seen a successful recovery before? 
What exact version are you using?

Can you share any related info in the logs for that node?

- Mark

On Dec 5, 2012, at 6:48 PM, Nathaniel Domingo  wrote:

> Hi,
> 
> I'm very new to solr, less than a month. I just set up a solrcloud cluster
> last week and have encountered a problem. i have four nodes with two shards.
> one node is stuck at recovering for days now. how do i go about fixing this?
> 
> Thanks



SolrCloud - Query performance degrades with multiple servers

2012-12-05 Thread sausarkar
We are using SolrCloud and trying to configure it for testing purposes. We are
seeing that the average query time increases if we have more than one node in
the SolrCloud cluster. We have a single-shard 12 GB index.

Example: 1 node, average query time *~28 msec*, load 140 queries/second;
3 nodes, average query time *~110 msec*, load 420 queries/second distributed
equally on three servers, so essentially 140 qps on each node.

Is there any inter-node communication going on for queries? Is there any
setting on SolrCloud for query tuning for a cloud config with multiple nodes?
Please help.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Query-performance-degrades-with-multiple-servers-tp4024660.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud - ClusterState says we are the leader,but locally ...

2012-12-05 Thread Mark Miller
The waiting logging had to happen on restart unless it's some kind of bug.

Beyond that, something is off, but I have no clue why - it seems your 
clusterstate.json is not up to date at all.

Have you tried restarting the cluster then? Does that help at all?

Do you see any exceptions around zookeeper session timeouts?

- Mark

On Dec 5, 2012, at 4:57 PM, Sudhakar Maddineni  wrote:

> Hey Mark,
> 
> Yes, I am able to access all of the nodes under each shard from solrcloud
> admin UI.
> 
> 
>   - *It kind of looks like the urls solrcloud is using are not accessible.
>   When you go to the admin page and the cloud tab, can you access the urls it
   shows for each shard? That is, if you click one of the links or copy and
>   paste the address into a web browser, does it work?*
> 
Actually, I got these errors when my document upload task/job was running,
not during the cluster restart. Also, the job ran fine initially for the
first hour and started throwing these errors after indexing some docs.
> 
> Thx, Sudhakar.
> 
> 
> 
> 
> On Wed, Dec 5, 2012 at 5:38 PM, Mark Miller  wrote:
> 
>> It kind of looks like the urls solrcloud is using are not accessible. When
>> you go to the admin page and the cloud tab, can you access the urls it
>> shows for each shard? That is, if you click one of the links or copy and
>> paste the address into a web browser, does it work?
>> 
>> You may have to explicitly set the host= in solr.xml if it's not auto
>> detecting the right one. Make sure the ports look right too.
>> 
>>> waitForReplicasToComeUp
>>> INFO: Waiting until we see more replicas up: total=2 found=1
>>> timeoutin=17
>> 
>> That happens when you stop the cluster and try to start it again - before
>> a leader is chosen, it will wait for all known replicas for a shard to come
>> up so that everyone can sync up and have a chance to be the best leader. So
>> at this point it was only finding one of 2 known replicas and waiting for
>> the second to come up. After a couple minutes (configurable) it will just
>> continue anyway without the missing replica (if it doesn't show up).
>> 
>> - Mark
>> 
>> On Dec 5, 2012, at 4:21 PM, Sudhakar Maddineni 
>> wrote:
>> 
>>> Hi,
>>> We are uploading solr documents to the index in batches using 30 threads
>>> and using ThreadPoolExecutor, LinkedBlockingQueue with max limit set to
>>> 1.
>>> In the code, we are using HttpSolrServer and the add(inputDoc) method to
>>> add docs.
>>> And, we have the following commit settings in solrconfig:
>>> 
>>> <autoCommit>
>>>   <maxTime>30</maxTime>
>>>   <maxDocs>1</maxDocs>
>>>   <openSearcher>false</openSearcher>
>>> </autoCommit>
>>> 
>>> <autoSoftCommit>
>>>   <maxTime>1000</maxTime>
>>> </autoSoftCommit>
>>> 
>>> Cluster Details:
>>> 
>>> solr version - 4.0
>>> zookeeper version - 3.4.3 [zookeeper ensemble with 3 nodes]
>>> numshards=2 ,
>>> 001, 002, 003 are the solr nodes and these three are behind the
>>> loadbalancer  
>>> 001, 003 assigned to shard1; 002 assigned to shard2
>>> 
>>> 
>>> Logs: Getting the errors in the below sequence after uploading some docs:
>>> 
>> ---
>>> 003
>>> Dec 4, 2012 12:11:46 PM org.apache.solr.cloud.ShardLeaderElectionContext
>>> waitForReplicasToComeUp
>>> INFO: Waiting until we see more replicas up: total=2 found=1
>>> timeoutin=17
>>> 
>>> 001
>>> Dec 4, 2012 12:12:59 PM
>>> org.apache.solr.update.processor.DistributedUpdateProcessor
>>> doDefensiveChecks
>>> SEVERE: ClusterState says we are the leader, but locally we don't think
>> so
>>> 
>>> 003
>>> Dec 4, 2012 12:12:59 PM org.apache.solr.common.SolrException log
>>> SEVERE: forwarding update to <001>:8080/solr/core1/ failed - retrying ...
>>> 
>>> 001
>>> Dec 4, 2012 12:12:59 PM org.apache.solr.common.SolrException log
>>> SEVERE: Error uploading: org.apache.solr.common.SolrException: Server at
>>> /solr/core1. returned non ok status:503, message:Service Unavailable
>>> at
>>> 
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:372)
>>> at
>>> 
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>>> 001
>>> Dec 4, 2012 12:25:45 PM org.apache.solr.common.SolrException log
>>> SEVERE: Error while trying to recover.
>>> core=core1:org.apache.solr.common.SolrException: We are not the leader
>>> at
>>> 
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:401)
>>> 
>>> 001
>>> Dec 4, 2012 12:44:38 PM org.apache.solr.common.SolrException log
>>> SEVERE: Error uploading:
>> org.apache.solr.client.solrj.SolrServerException:
>>> IOException occured when talking to server at /solr/core1
>>> at
>>> 
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:413)
>>> at
>>> 
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>>> at
>>> 
>> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>>> at org.apache.solr.client.solrj.SolrServer.add(

Re: SolrCloud - ClusterState says we are the leader,but locally ...

2012-12-05 Thread Sudhakar Maddineni
Hey Mark,

Yes, I am able to access all of the nodes under each shard from solrcloud
admin UI.


   - *It kind of looks like the urls solrcloud is using are not accessible.
   When you go to the admin page and the cloud tab, can you access the urls it
   shows for each shard? That is, if you click one of the links or copy and
   paste the address into a web browser, does it work?*

Actually, I got these errors when my document upload task/job was running,
not during the cluster restart. Also, the job ran fine initially for the
first hour and started throwing these errors after indexing some docs.

Thx, Sudhakar.




On Wed, Dec 5, 2012 at 5:38 PM, Mark Miller  wrote:

> It kind of looks like the urls solrcloud is using are not accessible. When
> you go to the admin page and the cloud tab, can you access the urls it
> shows for each shard? That is, if you click one of the links or copy and
> paste the address into a web browser, does it work?
>
> You may have to explicitly set the host= in solr.xml if it's not auto
> detecting the right one. Make sure the ports look right too.
>
> > waitForReplicasToComeUp
> > INFO: Waiting until we see more replicas up: total=2 found=1
> > timeoutin=17
>
> That happens when you stop the cluster and try to start it again - before
> a leader is chosen, it will wait for all known replicas for a shard to come
> up so that everyone can sync up and have a chance to be the best leader. So
> at this point it was only finding one of 2 known replicas and waiting for
> the second to come up. After a couple minutes (configurable) it will just
> continue anyway without the missing replica (if it doesn't show up).
>
> - Mark
>
> On Dec 5, 2012, at 4:21 PM, Sudhakar Maddineni 
> wrote:
>
> > Hi,
> > We are uploading solr documents to the index in batches using 30 threads
> > and using ThreadPoolExecutor, LinkedBlockingQueue with max limit set to
> > 1.
> > In the code, we are using HttpSolrServer and the add(inputDoc) method to
> > add docs.
> > And, we have the following commit settings in solrconfig:
> >
> > <autoCommit>
> >   <maxTime>30</maxTime>
> >   <maxDocs>1</maxDocs>
> >   <openSearcher>false</openSearcher>
> > </autoCommit>
> >
> > <autoSoftCommit>
> >   <maxTime>1000</maxTime>
> > </autoSoftCommit>
> >
> > Cluster Details:
> > 
> > solr version - 4.0
> > zookeeper version - 3.4.3 [zookeeper ensemble with 3 nodes]
> > numshards=2 ,
> > 001, 002, 003 are the solr nodes and these three are behind the
> > loadbalancer  
> > 001, 003 assigned to shard1; 002 assigned to shard2
> >
> >
> > Logs: Getting the errors in the below sequence after uploading some docs:
> >
> ---
> > 003
> > Dec 4, 2012 12:11:46 PM org.apache.solr.cloud.ShardLeaderElectionContext
> > waitForReplicasToComeUp
> > INFO: Waiting until we see more replicas up: total=2 found=1
> > timeoutin=17
> >
> > 001
> > Dec 4, 2012 12:12:59 PM
> > org.apache.solr.update.processor.DistributedUpdateProcessor
> > doDefensiveChecks
> > SEVERE: ClusterState says we are the leader, but locally we don't think
> so
> >
> > 003
> > Dec 4, 2012 12:12:59 PM org.apache.solr.common.SolrException log
> > SEVERE: forwarding update to <001>:8080/solr/core1/ failed - retrying ...
> >
> > 001
> > Dec 4, 2012 12:12:59 PM org.apache.solr.common.SolrException log
> > SEVERE: Error uploading: org.apache.solr.common.SolrException: Server at
> > /solr/core1. returned non ok status:503, message:Service Unavailable
> > at
> >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:372)
> > at
> >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> > 001
> > Dec 4, 2012 12:25:45 PM org.apache.solr.common.SolrException log
> > SEVERE: Error while trying to recover.
> > core=core1:org.apache.solr.common.SolrException: We are not the leader
> > at
> >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:401)
> >
> > 001
> > Dec 4, 2012 12:44:38 PM org.apache.solr.common.SolrException log
> > SEVERE: Error uploading:
> org.apache.solr.client.solrj.SolrServerException:
> > IOException occured when talking to server at /solr/core1
> > at
> >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:413)
> > at
> >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> > at
> >
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
> > at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
> > ... 5 lines omitted ...
> > at java.lang.Thread.run(Unknown Source)
> > Caused by: java.net.SocketException: Connection reset
> >
> >
> > After sometime, all the three servers are going down.
> >
> > Appreciate, if someone could let us know what we are missing.
> >
> > Thx,Sudhakar.
>
>


Re: Sorting by multi-valued field

2012-12-05 Thread Chris Hostetter

: (3) A third possibility I thought of was to add a field for every day of
: the year to each document that contains the next-start date for that
: particular day: next_start_20121212_dt etc. Then I could order by the
: dynamic field. But as only some of my events are recurring and few of those
: recurring over long periods of time I think it does not make too much sense.

sorting on any/all of those dynamic fields is probably not going to be 
feasible ... especially if the user can use a date picker to select an 
arbitrary field to sort on -- the memory requirements for the FieldCache 
(used in sorting) are going to be huge.

I would really strongly suggest you re-think your problem ... i suspect 
grouping is the only viable out of the box solution unless you write a 
custom plugin.


-Hoss
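
[For concreteness, the grouping approach might look like this - a hedged
sketch with hypothetical field names, one indexed document per occurrence:

    q=*:*
    fq=start_dt:[NOW TO *]          # only upcoming occurrences
    group=true
    group.field=event_id            # collapse occurrences into one event
    group.sort=start_dt asc         # representative = next occurrence
    sort=start_dt asc               # order events by that next occurrence
]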


Re: SolrCloud - ClusterState says we are the leader,but locally ...

2012-12-05 Thread Mark Miller
It kind of looks like the urls solrcloud is using are not accessible. When you 
go to the admin page and the cloud tab, can you access the urls it shows for 
each shard? That is, if you click one of the links or copy and paste the address 
into a web browser, does it work?

You may have to explicitly set the host= in solr.xml if it's not auto detecting 
the right one. Make sure the ports look right too.

> waitForReplicasToComeUp
> INFO: Waiting until we see more replicas up: total=2 found=1
> timeoutin=17

That happens when you stop the cluster and try to start it again - before a 
leader is chosen, it will wait for all known replicas for a shard to come up so 
that everyone can sync up and have a chance to be the best leader. So at this 
point it was only finding one of 2 known replicas and waiting for the second to 
come up. After a couple minutes (configurable) it will just continue anyway 
without the missing replica (if it doesn't show up).

- Mark

On Dec 5, 2012, at 4:21 PM, Sudhakar Maddineni  wrote:

> Hi,
> We are uploading solr documents to the index in batches using 30 threads
> and using ThreadPoolExecutor, LinkedBlockingQueue with max limit set to
> 1.
> In the code, we are using HttpSolrServer and the add(inputDoc) method to
> add docs.
> And, we have the following commit settings in solrconfig:
> 
> <autoCommit>
>   <maxTime>30</maxTime>
>   <maxDocs>1</maxDocs>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> 
> <autoSoftCommit>
>   <maxTime>1000</maxTime>
> </autoSoftCommit>
> 
> Cluster Details:
> 
> solr version - 4.0
> zookeeper version - 3.4.3 [zookeeper ensemble with 3 nodes]
> numshards=2 ,
> 001, 002, 003 are the solr nodes and these three are behind the
> loadbalancer  
> 001, 003 assigned to shard1; 002 assigned to shard2
> 
> 
> Logs: Getting the errors in the below sequence after uploading some docs:
> ---
> 003
> Dec 4, 2012 12:11:46 PM org.apache.solr.cloud.ShardLeaderElectionContext
> waitForReplicasToComeUp
> INFO: Waiting until we see more replicas up: total=2 found=1
> timeoutin=17
> 
> 001
> Dec 4, 2012 12:12:59 PM
> org.apache.solr.update.processor.DistributedUpdateProcessor
> doDefensiveChecks
> SEVERE: ClusterState says we are the leader, but locally we don't think so
> 
> 003
> Dec 4, 2012 12:12:59 PM org.apache.solr.common.SolrException log
> SEVERE: forwarding update to <001>:8080/solr/core1/ failed - retrying ...
> 
> 001
> Dec 4, 2012 12:12:59 PM org.apache.solr.common.SolrException log
> SEVERE: Error uploading: org.apache.solr.common.SolrException: Server at
> /solr/core1. returned non ok status:503, message:Service Unavailable
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:372)
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> 001
> Dec 4, 2012 12:25:45 PM org.apache.solr.common.SolrException log
> SEVERE: Error while trying to recover.
> core=core1:org.apache.solr.common.SolrException: We are not the leader
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:401)
> 
> 001
> Dec 4, 2012 12:44:38 PM org.apache.solr.common.SolrException log
> SEVERE: Error uploading: org.apache.solr.client.solrj.SolrServerException:
> IOException occured when talking to server at /solr/core1
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:413)
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> at
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
> ... 5 lines omitted ...
> at java.lang.Thread.run(Unknown Source)
> Caused by: java.net.SocketException: Connection reset
> 
> 
> After sometime, all the three servers are going down.
> 
> Appreciate, if someone could let us know what we are missing.
> 
> Thx,Sudhakar.



Re: SolrCloud - ClusterState says we are the leader,but locally ...

2012-12-05 Thread Sudhakar Maddineni
using solr version - 4.0 final.

Thx, Sudhakar.

On Wed, Dec 5, 2012 at 5:26 PM, Mark Miller  wrote:

> What Solr version - beta, alpha, 4.0 final, 4X or 5X?
>
> - Mark
>
> On Dec 5, 2012, at 4:21 PM, Sudhakar Maddineni 
> wrote:
>
> > Hi,
> > We are uploading solr documents to the index in batches using 30 threads
> > and using ThreadPoolExecutor, LinkedBlockingQueue with max limit set to
> > 1.
> > In the code, we are using HttpSolrServer and the add(inputDoc) method to
> > add docs.
> > And, we have the following commit settings in solrconfig:
> >
> > <autoCommit>
> >   <maxTime>30</maxTime>
> >   <maxDocs>1</maxDocs>
> >   <openSearcher>false</openSearcher>
> > </autoCommit>
> >
> > <autoSoftCommit>
> >   <maxTime>1000</maxTime>
> > </autoSoftCommit>
> >
> > Cluster Details:
> > 
> > solr version - 4.0
> > zookeeper version - 3.4.3 [zookeeper ensemble with 3 nodes]
> > numshards=2 ,
> > 001, 002, 003 are the solr nodes and these three are behind the
> > loadbalancer  
> > 001, 003 assigned to shard1; 002 assigned to shard2
> >
> >
> > Logs: Getting the errors in the below sequence after uploading some docs:
> >
> ---
> > 003
> > Dec 4, 2012 12:11:46 PM org.apache.solr.cloud.ShardLeaderElectionContext
> > waitForReplicasToComeUp
> > INFO: Waiting until we see more replicas up: total=2 found=1
> > timeoutin=17
> >
> > 001
> > Dec 4, 2012 12:12:59 PM
> > org.apache.solr.update.processor.DistributedUpdateProcessor
> > doDefensiveChecks
> > SEVERE: ClusterState says we are the leader, but locally we don't think
> so
> >
> > 003
> > Dec 4, 2012 12:12:59 PM org.apache.solr.common.SolrException log
> > SEVERE: forwarding update to <001>:8080/solr/core1/ failed - retrying ...
> >
> > 001
> > Dec 4, 2012 12:12:59 PM org.apache.solr.common.SolrException log
> > SEVERE: Error uploading: org.apache.solr.common.SolrException: Server at
> > /solr/core1. returned non ok status:503, message:Service Unavailable
> > at
> >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:372)
> > at
> >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> > 001
> > Dec 4, 2012 12:25:45 PM org.apache.solr.common.SolrException log
> > SEVERE: Error while trying to recover.
> > core=core1:org.apache.solr.common.SolrException: We are not the leader
> > at
> >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:401)
> >
> > 001
> > Dec 4, 2012 12:44:38 PM org.apache.solr.common.SolrException log
> > SEVERE: Error uploading:
> org.apache.solr.client.solrj.SolrServerException:
> > IOException occured when talking to server at /solr/core1
> > at
> >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:413)
> > at
> >
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> > at
> >
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
> > at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
> > ... 5 lines omitted ...
> > at java.lang.Thread.run(Unknown Source)
> > Caused by: java.net.SocketException: Connection reset
> >
> >
> > After sometime, all the three servers are going down.
> >
> > Appreciate, if someone could let us know what we are missing.
> >
> > Thx,Sudhakar.
>
>


Re: SolrCloud - ClusterState says we are the leader,but locally ...

2012-12-05 Thread Mark Miller
What Solr version - beta, alpha, 4.0 final, 4X or 5X?

- Mark

On Dec 5, 2012, at 4:21 PM, Sudhakar Maddineni  wrote:

> Hi,
> We are uploading solr documents to the index in batches using 30 threads
> and using ThreadPoolExecutor, LinkedBlockingQueue with max limit set to
> 1.
> In the code, we are using HttpSolrServer and the add(inputDoc) method to
> add docs.
> And, we have the following commit settings in solrconfig:
> 
> <autoCommit>
>   <maxTime>30</maxTime>
>   <maxDocs>1</maxDocs>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> 
> <autoSoftCommit>
>   <maxTime>1000</maxTime>
> </autoSoftCommit>
> 
> Cluster Details:
> 
> solr version - 4.0
> zookeeper version - 3.4.3 [zookeeper ensemble with 3 nodes]
> numshards=2 ,
> 001, 002, 003 are the solr nodes and these three are behind the
> loadbalancer  
> 001, 003 assigned to shard1; 002 assigned to shard2
> 
> 
> Logs: Getting the errors in the below sequence after uploading some docs:
> ---
> 003
> Dec 4, 2012 12:11:46 PM org.apache.solr.cloud.ShardLeaderElectionContext
> waitForReplicasToComeUp
> INFO: Waiting until we see more replicas up: total=2 found=1
> timeoutin=17
> 
> 001
> Dec 4, 2012 12:12:59 PM
> org.apache.solr.update.processor.DistributedUpdateProcessor
> doDefensiveChecks
> SEVERE: ClusterState says we are the leader, but locally we don't think so
> 
> 003
> Dec 4, 2012 12:12:59 PM org.apache.solr.common.SolrException log
> SEVERE: forwarding update to <001>:8080/solr/core1/ failed - retrying ...
> 
> 001
> Dec 4, 2012 12:12:59 PM org.apache.solr.common.SolrException log
> SEVERE: Error uploading: org.apache.solr.common.SolrException: Server at
> /solr/core1. returned non ok status:503, message:Service Unavailable
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:372)
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> 001
> Dec 4, 2012 12:25:45 PM org.apache.solr.common.SolrException log
> SEVERE: Error while trying to recover.
> core=core1:org.apache.solr.common.SolrException: We are not the leader
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:401)
> 
> 001
> Dec 4, 2012 12:44:38 PM org.apache.solr.common.SolrException log
> SEVERE: Error uploading: org.apache.solr.client.solrj.SolrServerException:
> IOException occured when talking to server at /solr/core1
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:413)
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> at
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
> at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
> ... 5 lines omitted ...
> at java.lang.Thread.run(Unknown Source)
> Caused by: java.net.SocketException: Connection reset
> 
> 
> After sometime, all the three servers are going down.
> 
> Appreciate, if someone could let us know what we are missing.
> 
> Thx,Sudhakar.



Re: The shard called `properties`

2012-12-05 Thread Mark Miller
See the custom hashing issue - the UI has to be updated to ignore this.

Unfortunately, it seems that clients have to be hard-coded to realize
`properties` is not a shard unless we add another nested layer.

Should be 100% harmless.

- Mark

On Dec 5, 2012, at 5:05 AM, Markus Jelsma  wrote:

> Hi,
> 
> We're suddenly seeing a shard called `properties` in the cloud graph page 
> when testing today's trunk with a clean Zookeeper data directory. Any idea 
> where it comes from? We have not changed the solr.xml on any node.
> 
> Thanks



Re: Solr Nightly build server down ?

2012-12-05 Thread Mark Miller
Looks like it… hopefully it comes back soon.

- Mark

On Dec 5, 2012, at 7:52 AM, shreejay  wrote:

> Hi All, 
> 
> Is the server hosting nightly builds of Solr down?
> https://builds.apache.org/job/Solr-Artifacts-4.x/lastSuccessfulBuild/artifact/solr/package/
>  
> 
> If anyone knows an alternate link to download the nightly build please let
> me know. 
> 
> 
> --Shreejay
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-Nightly-build-server-down-tp4024493.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: setting hostPort for SolrCloud

2012-12-05 Thread Mark Miller
Be aware that you still have to set up Tomcat to run Solr on the right port - 
and you also have to provide the port to Solr on startup. With Jetty we do both 
with -Djetty.port - with Tomcat you have to set up Tomcat to run on the right 
port *and* tell Solr what that port is. By default that means also passing 
-Djetty.port - but you can change that to whatever you want in solr.xml (to 
hostPort or solr.port or whatever).

The problem is that it's difficult for a webapp to find what ports it's running 
on - you can only do it when a request actually comes in to my knowledge.

- Mark

On Dec 5, 2012, at 1:05 PM, Bill Au  wrote:

> I am using tomcat.  In my tomcat start script I have tried setting system
> properties with both
> 
> -Djetty.port=8080
> 
> and
> 
> -DhostPort=8080
> 
> but neither changed the host port for SolrCloud.  It still uses the default
> 8983.
> 
> Bill
> 
> 
> On Wed, Dec 5, 2012 at 12:11 PM, Jack Krupansky 
> wrote:
> 
>> Solr runs in a container and the container controls the port. So, you need
>> to tell the container which port to use.
>> 
>> For example,
>> 
>> java -Djetty.port=8180 -jar start.jar
>> 
>> -- Jack Krupansky
>> 
>> -Original Message- From: Bill Au
>> Sent: Wednesday, December 05, 2012 10:30 AM
>> To: solr-user@lucene.apache.org
>> Subject: setting hostPort for SolrCloud
>> 
>> 
>> Can hostPort for SolrCloud only be set in solr.xml?  I tried setting the
>> system property hostPort and jetty.port on the Java command line but
>> neither of them work.
>> 
>> Bill
>> 



Re: setting hostPort for SolrCloud

2012-12-05 Thread Mark Miller
It is set in solr.xml, but solr.xml has a syntax that allows you to set values 
by system properties.

By default solr.xml is set up so that the jetty.port system property sets 
the hostPort. I'm sure that works in general, so I'm not sure why it's not 
working for you.

Can you provide your solr.xml?

- Mark

On Dec 5, 2012, at 7:30 AM, Bill Au  wrote:

> Can hostPort for SolrCloud only be set in solr.xml?  I tried setting the
> system property hostPort and jetty.port on the Java command line but
> neither of them work.
> 
> Bill
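For reference, the default solr.xml wires hostPort to jetty.port through property substitution; a trimmed sketch of the relevant element (other attributes omitted):

<cores adminPath="/admin/cores" hostPort="${jetty.port:8983}" ...>

The ${jetty.port:8983} syntax means "use the jetty.port system property if set, otherwise 8983", so to key it off -DhostPort instead you would change it to ${hostPort:8983}.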



Re: setting hostPort for SolrCloud

2012-12-05 Thread Bill Au
I am using tomcat.  In my tomcat start script I have tried setting system
properties with both

-Djetty.port=8080

and

-DhostPort=8080

but neither changed the host port for SolrCloud.  It still uses the default
8983.

Bill


On Wed, Dec 5, 2012 at 12:11 PM, Jack Krupansky wrote:

> Solr runs in a container and the container controls the port. So, you need
> to tell the container which port to use.
>
> For example,
>
> java -Djetty.port=8180 -jar start.jar
>
> -- Jack Krupansky
>
> -Original Message- From: Bill Au
> Sent: Wednesday, December 05, 2012 10:30 AM
> To: solr-user@lucene.apache.org
> Subject: setting hostPort for SolrCloud
>
>
> Can hostPort for SolrCloud only be set in solr.xml?  I tried setting the
> system property hostPort and jetty.port on the Java command line but
> neither of them work.
>
> Bill
>


Re: How to make a plugin SchemaAware or XAware, runtime wise? (Solr 3.6.1)

2012-12-05 Thread Chris Hostetter

: So I'm creating a request handler plugin where I want to fetch some values
: from the schema. I make it SchemaAware but the inform method is never
: called. I assume that there is some way of registering the instance as
: aware (and I have seen and used this before, although that information
: escapes me at this point). I can't find any documentation on this in

http://wiki.apache.org/solr/SolrPlugins#Building_Plugins

Because of lifecycle complexities, not all plugins can be 
"ResourceLoaderAware" or "SolrCoreAware".

If you want to initialize your RequestHandler with information from the 
ResourceLoader, just make it SolrCoreAware and then in your 
inform(SolrCore core) method you can call core.getResourceLoader().


-Hoss
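A minimal sketch of what Hoss describes, against the 3.6 API (class name and response key are made up):

import org.apache.solr.core.SolrCore;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.schema.IndexSchema;
import org.apache.solr.util.plugin.SolrCoreAware;

public class SchemaAwareHandler extends RequestHandlerBase implements SolrCoreAware {
  private IndexSchema schema;

  @Override
  public void inform(SolrCore core) {
    // Called once the core is initialized - a safe place to read the schema,
    // or to call core.getResourceLoader() for other resources.
    schema = core.getSchema();
  }

  @Override
  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
    rsp.add("uniqueKey", schema.getUniqueKeyField().getName());
  }

  @Override public String getDescription() { return "schema-aware handler sketch"; }
  @Override public String getSource() { return "$URL$"; }
  @Override public String getSourceId() { return "$Id$"; }
  @Override public String getVersion() { return "$Revision$"; }
}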


Re: not possible to apply StatsComponent to pseudo-field?

2012-12-05 Thread Jack Krupansky
Correct, but it sounds like a great feature to request - the ability to 
request stats for a function query.


-- Jack Krupansky

-Original Message- 
From: Edward Garrett

Sent: Wednesday, December 05, 2012 10:56 AM
To: solr-user@lucene.apache.org
Subject: not possible to apply StatsComponent to pseudo-field?

hi,

i guess it's not possible to apply stats to a pseudo field? for
example if you have

&fl=*,count:termfreq(text,'solr')

you can't tack on

&fl=*,count:termfreq(text,'solr')&stats=true&stats.field=count

to give you for example the total number of times that the word 'solr'
appeared in the field 'text' for the subset of documents retrieved by
a given query?

the totaltermfreq function is limiting in that it applies across the
entire index, ignoring your query/filter query.

thanks,
edward 
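Until such a feature exists, one client-side workaround is to let Solr compute the pseudo-field and sum it while paging through the result set; a SolrJ sketch (server URL, field names and page size are illustrative, and this is obviously slow for large result sets):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class TermfreqTotal {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
    SolrQuery q = new SolrQuery("text:solr");
    q.setFields("id", "cnt:termfreq(text,'solr')"); // per-document pseudo-field
    q.setRows(1000);
    long total = 0;
    for (int start = 0; ; start += 1000) {
      q.setStart(start);
      QueryResponse rsp = server.query(q);
      for (SolrDocument doc : rsp.getResults()) {
        total += ((Number) doc.getFieldValue("cnt")).longValue();
      }
      if (start + 1000 >= rsp.getResults().getNumFound()) break;
    }
    System.out.println("Total occurrences across the result set: " + total);
  }
}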



Solr 4.0 and SIREn plugin

2012-12-05 Thread balaji.gandhi
Hi Team,

We are looking into using the SIREn plugin with Solr 4.0. I am just
checking whether this plugin works with Solr 4.0.

Also if someone has some good documentation on this, it will be of great
help.

Thanks,
Balaji



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-0-and-SIREn-plugin-tp4024553.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Incremental Update of index

2012-12-05 Thread Sandeep Mestry
Hi Amit/Shanu,

You can create the solr document for only the updated record and index it
to ensure only the updated record gets indexed.
You need not rebuild indexes from scratch for every record update.

Thanks,
Sandeep
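A minimal SolrJ sketch of that (server URL and field names are made-up examples):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class UpdateSingleRecord {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "42");               // same uniqueKey as the stored document
    doc.addField("title", "updated title"); // plus every other field of the record
    solr.add(doc);   // replaces the existing document with id=42
    solr.commit();
  }
}

Note that the document sent must be complete: adding a document whose uniqueKey already exists replaces the whole stored document, not just the fields you supply.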


SOLR4 (sharded) and join query

2012-12-05 Thread adm1n
Hi,

I'm running a join query; let's say it looks as follows: {!join
from=some_id to=another_id}(a_id:55 AND some_type_id:3). When I run it on a
single instance of SOLR I get the correct result, but when I run it on
the sharded system (2 shards with a replica for each shard, total index counts
~300K entries) I get a partial result.

Is there any issue with supporting join queries on a sharded system, or maybe
there is some configuration tweak that I'm missing?

Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR4-sharded-and-join-query-tp4024547.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: FW: Replication error and Shard Inconsistencies..

2012-12-05 Thread Andre Bois-Crettez

Not sure, but maybe you are running out of file descriptors?
On each Solr instance, look at the "dashboard" admin page; there is a
bar with "File Descriptor Count".

However if this was the case, I would expect to see lots of errors in
the solr logs...

André


On 12/05/2012 06:41 PM, Annette Newton wrote:

Sorry to bombard you - final update of the day...

One thing that I have noticed is that we have a lot of connections between
the solr boxes with the connection set to CLOSE_WAIT and they hang around
for ages.

-Original Message-
From: Annette Newton [mailto:annette.new...@servicetick.com]
Sent: 05 December 2012 13:55
To: solr-user@lucene.apache.org
Subject: FW: Replication error and Shard Inconsistencies..

Update:

I did a full restart of the solr cloud setup, stopped all the instances,
cleared down zookeeper and started them up individually.  I then removed the
index from one of the replicas, restarted solr and it replicated ok.  So I'm
wondering whether this is something that happens over a period of time.

Also just to let you know I changed the schema a couple of times and
reloaded the cores on all instances previous to the problem.  Don't know if
this could have contributed to the problem.

Thanks.

-Original Message-
From: Annette Newton [mailto:annette.new...@servicetick.com]
Sent: 05 December 2012 09:04
To: solr-user@lucene.apache.org
Subject: RE: Replication error and Shard Inconsistencies..

Hi Mark,

Thanks so much for the reply.

We are using the release version of 4.0..

It's very strange: replication appears to be underway, but no files are being
copied across.  I have attached both the log from the new node that I tried
to bring up and the Schema and config we are using.

I think it's probably something weird with our config, so I'm going to play
around with it today.  If I make any progress I'll send an update.

Thanks again.

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: 05 December 2012 00:04
To: solr-user@lucene.apache.org
Subject: Re: Replication error and Shard Inconsistencies..

Hey Annette,

Are you using Solr 4.0 final? A version of 4x or 5x?

Do you have the logs for when the replica tried to catch up to the leader?

Stopping and starting the node is actually a fine thing to do. Perhaps you
can try it again and capture the logs.

If a node is not listed as live but is in the clusterstate, that is fine. It
shouldn't be consulted. To remove it, you either have to unload it with the
core admin api or you could manually delete its registered state under the
node states node that the Overseer looks at.

Also, it would be useful to see the logs of the new node coming up. There
should be info about what happens when it tries to replicate.

It almost sounds like replication is just not working for your setup at all
and that you have to tweak some configuration. You shouldn't see these nodes
as active then though - so we should get to the bottom of this.

- Mark

On Dec 4, 2012, at 4:37 AM, Annette Newton
wrote:


Hi all,

I have a quite weird issue with Solr cloud.  I have a 4 shard, 2
replica

setup, yesterday one of the nodes lost communication with the cloud setup,
which resulted in it trying to run replication, this failed, which has left
me with a Shard (Shard 4) that has one node with 2,833,940 documents on the
leader and 409,837 on the follower - obviously a big discrepancy and this
leads to queries returning differing results depending on which of these
nodes it gets the data from.  There is no indication of a problem on the
admin site other than the big discrepancy in the number of documents.  They
are all marked as active etc.


So I thought that I would force replication to happen again, by
stopping

and starting solr (probably the wrong thing to do) but this resulted in no
change.  So I turned off that node and replaced it with a new one.  In
zookeeper live nodes doesn't list that machine but it is still being shown
as active in the ClusterState.json, I have attached images showing this.
This means the new node hasn't replaced the old node but is now a replica on
Shard 1!  Also that node doesn't appear to have replicated Shard 1's data
anyway, it didn't get marked with replicating or anything.


How do I clear the zookeeper state without taking down the entire solr

cloud setup?  How do I force a node to replicate from the others in the
shard?


Thanks in advance.

Annette Newton










--
André Bois-Crettez

Search technology, Kelkoo
http://www.kelkoo.com/




Re: Restricting search results by field value

2012-12-05 Thread Andre Bois-Crettez

If you do grouping on source_id, it should be enough to request 3 times
more documents than you need, then reorder and drop the bottom.

Is a 3x overhead acceptable?



On 12/05/2012 12:04 PM, Tom Mortimer wrote:

Hi everyone,

I've got a problem where I have docs with a source_id field, and there can be 
many docs from each source. Searches will typically return docs from many 
sources. I want to restrict the number of docs from each source in results, so 
there will be no more than (say) 3 docs from source_id=123 etc.

Field collapsing is the obvious approach, but I want to get the results back in 
relevancy order, not grouped by source_id. So it looks like I'll have to fetch 
more docs than I need to and re-sort them. It might even be better to count 
source_ids in the client code and drop excess docs that way, but the potential 
overhead is large.

Is there any way of doing this in Solr without hacking in a custom Lucene 
Collector? (which doesn't look all that straightforward).

cheers,
Tom


--
André Bois-Crettez

Search technology, Kelkoo
http://www.kelkoo.com/


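A SolrJ sketch of the grouping approach described above (URL and field names are assumptions); with group.main=true Solr returns a flat list already capped at 3 documents per source, which the client then re-sorts into global relevancy order:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

public class CapPerSource {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
    SolrQuery q = new SolrQuery("some query");
    q.setFields("id", "source_id", "score");
    q.set("group", true);
    q.set("group.field", "source_id");
    q.set("group.limit", 3);   // at most 3 docs per source
    q.set("group.main", true); // flat doc list instead of a grouped response
    q.setRows(30);             // over-fetch, then trim after re-sorting
    SolrDocumentList docs = solr.query(q).getResults();
    // group.main emits documents group by group, so restore global score order:
    List<SolrDocument> sorted = new ArrayList<SolrDocument>(docs);
    Collections.sort(sorted, new Comparator<SolrDocument>() {
      public int compare(SolrDocument a, SolrDocument b) {
        return ((Float) b.getFieldValue("score")).compareTo((Float) a.getFieldValue("score"));
      }
    });
    System.out.println(sorted.size() + " docs, max 3 per source");
  }
}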


FW: Replication error and Shard Inconsistencies..

2012-12-05 Thread Annette Newton
Sorry to bombard you - final update of the day...

One thing that I have noticed is that we have a lot of connections between
the solr boxes with the connection set to CLOSE_WAIT and they hang around
for ages.

-Original Message-
From: Annette Newton [mailto:annette.new...@servicetick.com] 
Sent: 05 December 2012 13:55
To: solr-user@lucene.apache.org
Subject: FW: Replication error and Shard Inconsistencies..

Update:

I did a full restart of the solr cloud setup, stopped all the instances,
cleared down zookeeper and started them up individually.  I then removed the
index from one of the replicas, restarted solr and it replicated ok.  So I'm
wondering whether this is something that happens over a period of time. 

Also just to let you know I changed the schema a couple of times and
reloaded the cores on all instances previous to the problem.  Don't know if
this could have contributed to the problem.

Thanks.

-Original Message-
From: Annette Newton [mailto:annette.new...@servicetick.com]
Sent: 05 December 2012 09:04
To: solr-user@lucene.apache.org
Subject: RE: Replication error and Shard Inconsistencies..

Hi Mark,

Thanks so much for the reply.

We are using the release version of 4.0..

It's very strange: replication appears to be underway, but no files are being
copied across.  I have attached both the log from the new node that I tried
to bring up and the Schema and config we are using.

I think it's probably something weird with our config, so I'm going to play
around with it today.  If I make any progress I'll send an update.

Thanks again.

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: 05 December 2012 00:04
To: solr-user@lucene.apache.org
Subject: Re: Replication error and Shard Inconsistencies..

Hey Annette, 

Are you using Solr 4.0 final? A version of 4x or 5x?

Do you have the logs for when the replica tried to catch up to the leader?

Stopping and starting the node is actually a fine thing to do. Perhaps you
can try it again and capture the logs.

If a node is not listed as live but is in the clusterstate, that is fine. It
shouldn't be consulted. To remove it, you either have to unload it with the
core admin api or you could manually delete its registered state under the
node states node that the Overseer looks at.

Also, it would be useful to see the logs of the new node coming up. There
should be info about what happens when it tries to replicate.

It almost sounds like replication is just not working for your setup at all
and that you have to tweak some configuration. You shouldn't see these nodes
as active then though - so we should get to the bottom of this.

- Mark

On Dec 4, 2012, at 4:37 AM, Annette Newton 
wrote:

> Hi all,
>  
> I have a quite weird issue with Solr cloud.  I have a 4 shard, 2 
> replica
setup, yesterday one of the nodes lost communication with the cloud setup,
which resulted in it trying to run replication, this failed, which has left
me with a Shard (Shard 4) that has one node with 2,833,940 documents on the
leader and 409,837 on the follower - obviously a big discrepancy and this
leads to queries returning differing results depending on which of these
nodes it gets the data from.  There is no indication of a problem on the
admin site other than the big discrepancy in the number of documents.  They
are all marked as active etc.
>  
> So I thought that I would force replication to happen again, by 
> stopping
and starting solr (probably the wrong thing to do) but this resulted in no
change.  So I turned off that node and replaced it with a new one.  In
zookeeper live nodes doesn't list that machine but it is still being shown
as active in the ClusterState.json, I have attached images showing this.
This means the new node hasn't replaced the old node but is now a replica on
Shard 1!  Also that node doesn't appear to have replicated Shard 1's data
anyway, it didn't get marked with replicating or anything. 
>  
> How do I clear the zookeeper state without taking down the entire solr
cloud setup?  How do I force a node to replicate from the others in the
shard?
>  
> Thanks in advance.
>  
> Annette Newton
>  
>  
> 







Re: How to SWAP cores (or collections) with SolrCloud (SOLR-3866)

2012-12-05 Thread Andre Bois-Crettez

On 12/05/2012 02:09 AM, Mark Miller wrote:

On Dec 4, 2012, at 4:57 AM, Andre Bois-Crettez  wrote:


* what can we do to help progress on SOLR-3866 ? Maybe use case
scenarios, detailing desired behavior ? Constrains on what cores or
collections are allowed to SWAP, ie. same config, same doc->shard
assignments ?


Yes please - if you could elaborate on that issue, I can help you try and get 
something in.

- Mark

Thanks.

I detailed our use case on the jira issue.
Hoping it is clear enough, let me know if anything more is needed.

--
André Bois-Crettez

Search technology, Kelkoo
http://www.kelkoo.com/




Re: [Solrj 4.0] How use JOIN

2012-12-05 Thread iwo
Hi Roman,
  from the Solr web service or admin interface you receive the correct result.

This is the correct join query syntax:
{!join from=parent to=id}(name:John AND age:17) 

You can add this directly via the query.setQuery() method.










-
Complicare è facile, semplificare é difficile. 
Complicated is easy, simple is hard.
quote: http://it.wikipedia.org/wiki/Bruno_Munari
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solrj-4-0-How-use-JOIN-tp4024262p4024525.html
Sent from the Solr - User mailing list archive at Nabble.com.
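For completeness, a minimal SolrJ version of that (the server URL is an assumption):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class JoinQueryExample {
  public static void main(String[] args) throws Exception {
    HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
    SolrQuery query = new SolrQuery();
    query.setQuery("{!join from=parent to=id}(name:John AND age:17)");
    QueryResponse rsp = solr.query(query);
    System.out.println(rsp.getResults().getNumFound() + " documents matched");
  }
}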


Re: setting hostPort for SolrCloud

2012-12-05 Thread Jack Krupansky
Solr runs in a container and the container controls the port. So, you need 
to tell the container which port to use.


For example,

java -Djetty.port=8180 -jar start.jar

-- Jack Krupansky

-Original Message- 
From: Bill Au

Sent: Wednesday, December 05, 2012 10:30 AM
To: solr-user@lucene.apache.org
Subject: setting hostPort for SolrCloud

Can hostPort for SolrCloud only be set in solr.xml?  I tried setting the
system property hostPort and jetty.port on the Java command line but
neither of them work.

Bill 



Re: Getting deleted documents during DIH full-import

2012-12-05 Thread Shawn Heisey

On 12/5/2012 9:19 AM, Erick Erickson wrote:

Probably what you're seeing is that as segments are merged, deleted
documents are purged.

As to how the deleted docs got there in the first place, were you using an
index that had been populated before?


After sleeping on it, I also realized that it was merges removing the 
deleted docs.  Then I read your message confirming the idea.


The first thing the indexing program does to all build cores before 
kicking off DIH is deleteByQuery("*:*"), commit, and optimize.  The 
full-import is not called with clean=false, so that should be another 
thing that wipes the index.


After the import is done and my solrj app makes things completely 
current, if I do a count(*) on the database table, I do get the same 
number (78626805 at this moment) as when I do a distributed search for 
*:* on Solr.  The anomaly during import concerns me, but it doesn't 
appear to be causing any real problems.


Thanks,
Shawn



Re: Solr 4.0 ngroups issue workaround

2012-12-05 Thread Jack Krupansky
Sorry, not that I know of. This is a disturbing design issue. I mean, I 
understand the value of the feature, but resolving it is not currently 
feasible, other than by extremely slow brute force - but maybe that 
option should be pursued anyway for situations where "zippy performance" is 
not an absolute requirement.


-- Jack Krupansky

-Original Message- 
From: shreejay

Sent: Wednesday, December 05, 2012 11:42 AM
To: solr-user@lucene.apache.org
Subject: Solr 4.0 ngroups issue workaround

Hi All,

I have a Solrcloud instance with 6 million documents. We are using the
ngroups feature in a few places and I am aware that this is still a JIRA
issue with work in progress (and some patches).

Apart from using the patch  here
  , and re-indexing data so
that all documents with same group field are on the same server, has anyone
else tried or used any alternate methods?

I wanted to see if there would be any other options before re-indexing.

Thanks.

--Shreejay




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-0-ngroups-issue-workaround-tp4024513.html
Sent from the Solr - User mailing list archive at Nabble.com. 



How to make a plugin SchemaAware or XAware, runtime wise? (Solr 3.6.1)

2012-12-05 Thread Per Fredelius
So I'm creating a request handler plugin where I want to fetch some values
from the schema. I make it SchemaAware but the inform method is never
called. I assume that there is some way of registering the instance as
aware (and I have seen and used this before, although that information
escapes me at this point). I can't find any documentation on this in
particular and neither can I find any calls doing this in the solr
source.
But this could very well be done outside the

So, is there some procedure for arranging to get inform calls?

/Thanks :)


Concern with using external SQL server for DIH

2012-12-05 Thread Spadez
Hi,

I am looking to import entries to my SOLR server by using the DIH,
connecting to an external PostgreSQL server using the JDBC driver. I will
be importing about 50,000 entries each time. 

Is connecting to an external SQL server for my data unreliable or risky, or
is it instead perfectly reasonable?

My alternative is to export the SQL file on the other server, download the
SQL file to my SOLR server, import it into my Solr server's copy of PostgreSQL
and then run the DIH on the local database.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Concern-with-using-external-SQL-server-for-DIH-tp4024514.html
Sent from the Solr - User mailing list archive at Nabble.com.
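For what it's worth, pointing DIH at the remote database only requires the JDBC URL in the data-config; a sketch (host, database and credentials are placeholders, and the PostgreSQL JDBC jar must be on Solr's classpath):

<dataSource type="JdbcDataSource"
            driver="org.postgresql.Driver"
            url="jdbc:postgresql://dbhost:5432/mydb"
            user="solr_reader"
            password="..."/>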


Solr 4.0 ngroups issue workaround

2012-12-05 Thread shreejay
Hi All,

I have a Solrcloud instance with 6 million documents. We are using the
ngroups feature in a few places and I am aware that this is still a JIRA
issue with work in progress (and some patches). 

Apart from using the patch  here
  , and re-indexing data so
that all documents with same group field are on the same server, has anyone
else tried or used any alternate methods? 

I wanted to see if there would be any other options before re-indexing. 

Thanks. 

--Shreejay




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-0-ngroups-issue-workaround-tp4024513.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Getting deleted documents during DIH full-import

2012-12-05 Thread Erick Erickson
Probably what you're seeing is that as segments are merged, deleted
documents are purged.

As to how the deleted docs got there in the first place, were you using an
index that had been populated before?

Best
Erick


On Tue, Dec 4, 2012 at 5:06 PM, Shawn Heisey  wrote:

> On 12/4/2012 5:33 PM, Shawn Heisey wrote:
>
>> I am doing a DIH full import on a very recent checkout from branch_4x.
>>  Something I've recently done differently is enabling autocommit.  I am
>> seeing that there are deleted documents in some of the indexes.  See
>> "Development Build Indexes" at the bottom of the following screenshot.
>>  When the import is complete, the numbered shards will contain 13 million
>> documents.
>>
>> http://dl.dropbox.com/u/**97770508/statuspage-deletes-**import.png
>>
>> The MySQL database that this imports from has a unique index on the field
>> that Solr is using for its UniqueKey, soit's not possible to have
>> duplicates.  Each import uses one SELECT statement for the entire 13
>> million document import.  What might be leading to these deleted docs?
>>
>
> Interesting development:  The imports are now up to over 11 million
> documents, but now the number of deleted documents on all shards is zero.
>
> I calculate deleted documents on my stats page by subtracting numDocs from
> maxDoc, information gathered from /admin/mbeans?stats=true.
>
> Thanks,
> Shawn
>
>


Re: SOLR4 cluster - strange CPU spike on slave

2012-12-05 Thread Erick Erickson
In addition to Mark's comment, be sure you aren't starving the memory for
the OS by over-allocating to the JVM.

FWIW,
Erick


On Tue, Dec 4, 2012 at 2:25 AM, John Nielsen  wrote:

> Success!
>
> I tried adding -XX:+UseConcMarkSweepGC to java to make it GC earlier. We
> haven't seen any spikes since.
>
> I'm cautiously optimistic though and will be monitoring the servers for a
> week or so before declaring final victory.
>
> The post about mmapdirectory is really interesting. We switched to using
> that from NRTCachingDirectory and am monitoring performance as well.
> Initially performance doesn't look stellar, but i suspect that we lack
> memory in the server to really make it shine.
>
>
> Med venlig hilsen / Best regards
>
> *John Nielsen*
> Programmer
>
>
>
> *MCB A/S*
> Enghaven 15
> DK-7500 Holstebro
>
> Kundeservice: +45 9610 2824
> p...@mcb.dk
> www.mcb.dk
>
>
>
> On Fri, Nov 30, 2012 at 3:13 PM, Erick Erickson  >wrote:
>
> > right, so here's what I'd check for.
> >
> > Your logs should show a replication pretty coincident with the spike and
> > that should be in the log. Note: the replication should complete just
> > before the spike.
> >
> > Or you can just turn replication off and fire it manually to try to force
> > the situation at will, see:
> > http://wiki.apache.org/solr/SolrReplication#HTTP_API. (but note that
> > you'll
> > have to wait until the index has changed on the master to see any
> action).
> >
> > So you should be able to create your spike at will. And this will be
> pretty
> > normal. When replication happens, a new searcher is opened, caches are
> > filled, autowarming is done, all kinds of stuff like that. During this
> > period, the _old_ searcher is still open, which will both cause the CPU
> to
> > be busier and require additional memory. Once the new searcher is warmed,
> > new queries go to it, and when the old searcher has finished serving all
> > the queries it shuts down and all the resources are freed. Which is why
> > commits are expensive operations.
> >
> > All of which means that so far I don't think there's a problem, this is
> > just normal Solr operation. If you're seeing responsiveness problems when
> > serving queries you probably want to throw more hardware (particularly
> > memory) at the problem.
> >
> > But when thinking about memory allocating to the JVM, _really_ read Uwe's
> > post here:
> > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> >
> > Best
> > Erick
> >
> >
> > On Thu, Nov 29, 2012 at 2:39 AM, John Nielsen  wrote:
> >
> > > Yup you read it right.
> > >
> > > We originally intended to do all our indexing to varnish02, replicate
> to
> > > varnish01 and then search from varnish01 (through a fail-over ip which
> > > would switch the reader to varnish02 in case of trouble).
> > >
> > > When I saw the spikes, I tried to eliminate possibilities by starting
> > > searching from varnish02, leaving varnish01 with nothing to do but to
> > > receive replication data. This did not remove the spikes. As soon as
> this
> > > spike is fixed, I will start searching from varnish01 again. These sort
> > of
> > > debug antics are only possible because, although we do have customers
> > using
> > > this, we are still in our beta phase.
> > >
> > > Varnish01 never receives any manual commit orders. Varnish02 does from
> > time
> > > to time.
> > >
> > > Oh, and I accidentally misinformed you before. (damn secondary
> language)
> > We
> > > are actually seeing the spikes on both servers. I was just focusing on
> > > varnish01 because I use it to eliminate possibilities.
> > >
> > > It just occurred to me now; We tried switching off our feeder/index
> tool
> > > for 24 hours, and we didn't see any spikes during this period, so
> > receiving
> > > replication data certainly has something to do with it.
> > >
> > > Med venlig hilsen / Best regards
> > >
> > > *John Nielsen*
> > > Programmer
> > >
> > >
> > >
> > > *MCB A/S*
> > > Enghaven 15
> > > DK-7500 Holstebro
> > >
> > > Kundeservice: +45 9610 2824
> > > p...@mcb.dk
> > > www.mcb.dk
> > >
> > >
> > >
> > > On Thu, Nov 29, 2012 at 3:20 AM, Erick Erickson <
> erickerick...@gmail.com
> > > >wrote:
> > >
> > > > Am I reading this right? All you're doing on varnish1 is replicating
> to
> > > it?
> > > > You're not searching or indexing? I'm sure I'm misreading this.
> > > >
> > > >
> > > > "The spike, which only lasts for a couple of minutes, sends the disks
> > > > racing" This _sounds_ suspiciously like segment merging, especially
> the
> > > > "disks racing" bit. Or possibly replication. Neither of which make
> much
> > > > sense. But is there any chance that somehow multiple commits are
> being
> > > > issued? Of course if varnish1 is a slave, that shouldn't be happening
> > > > either.
> > > >
> > > > And the whole bit about nothing going to the logs is just bizarre.
> I'm
> > > > tempted to claim hardware gremlins, especially if you see nothing
> > similar
> > > > on varnish2. Or 
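For reference, the GC flag John mentions earlier in this thread goes on the container's JVM command line, e.g. with the bundled Jetty (the heap size here is just an example):

java -Xmx8g -XX:+UseConcMarkSweepGC -jar start.jar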

Re: Facet with large number of unigram entries

2012-12-05 Thread Erick Erickson
This is really not the use-case faceting is designed for; I don't think
there's really any good way to speed this case up. What is the higher-level
issue you're trying to solve? Perhaps there's a better way to do it.

I'm not sure why you think altering the cache settings would help - they're
really pretty irrelevant to the faceting.

Best
Erick


On Tue, Dec 4, 2012 at 12:45 AM, Andreas Niekler <
aniek...@informatik.uni-leipzig.de> wrote:

> Dear List,
>
> I have an index with 2.000.000 articles. All those texts get tokenized
> while indexing. On this data I run a faceted query like this (to retrieve
> associated words):
>
> select?q=a_spell:{some word}&facet.method=enum&facet=true&facet.field=Paragraph&facet.limit=10&facet.prefix={some prefix}&facet.mincount=1500&indent=1&fl=_id&wt=json&rows=0
>
>
> I have more than 5.000.000 unique tokens in the index and the facet query
> is quite slow. I also tried different FastLRUCache settings in the
> filterCache.
>
> Does anybody have a hint how I could improve performance with this setup?
>
> Thank you all
>
> --
> Andreas Niekler, Dipl. Ing. (FH)
> NLP Group | Department of Computer Science
> University of Leipzig
> Johannisgasse 26 | 04103 Leipzig
>
> mail: aniek...@informatik.uni-leipzig.de
>


RE: Turn on Unicode support

2012-12-05 Thread Nguyen, Vincent (CDC/OD/OADS) (CTR)
Thanks a lot Shawn!  You were right, I didn't think to look into Tomcat.  I 
enabled UTF8 in Tomcat and everything works now. Thanks!

Vincent Vu Nguyen


-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Wednesday, December 05, 2012 10:12 AM
To: solr-user@lucene.apache.org
Subject: Re: Turn on Unicode support

On 12/5/2012 7:31 AM, Nguyen, Vincent (CDC/OD/OADS) (CTR) wrote:
> Is there a way to turn on support for Unicode characters in version 1.4.1?  
> The strange thing is that my coworker and I are supposed to have the same 
> configuration, yet on her machine, there seems to be Unicode support enabled.
>
> For example, if I use the SOLR admin to do a search for a term with the 
> 'Registered trademark ®' character, it will translate to 'â®'

Solr has full UTF8/Unicode support already.  It runs in a servlet container, 
like Jetty, Tomcat, etc.  The container must also be set up to use the UTF8 
character set.  For interaction with Solr, your browser must also be set up for 
UTF8, which in some cases may mean your OS needs to have its locale settings 
changed.

Whatever source of index data you are using must also provide UTF8 data to 
Solr.  For example, if your installation of Solr is using the dataimport 
handler, you must also set up your data source as UTF8.

Thanks,
Shawn
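The Tomcat change Vincent describes above is typically the URIEncoding attribute on the HTTP connector in server.xml, along these lines (port and protocol are examples):

<Connector port="8080" protocol="HTTP/1.1" URIEncoding="UTF-8"/>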



RE: DIH nested entities don't work

2012-12-05 Thread Dyer, James
Maarten,

Glad to hear that your DIH experiment worked well for you.  

To implement something like Endeca's guided navigation, see 
http://wiki.apache.org/solr/SolrFacetingOverview .  If you need to implement 
multi-level faceting, see http://wiki.apache.org/solr/HierarchicalFaceting  
(but caution:  some of the techniques here are for non-committed feature 
patches).

If you're trying to do anything but the most simple cases, I would recommend 
getting yourself a good book that walks you through it, such as Smiley&Pugh's 
Solr Book.  There is also a lot to read on the topic from mail lists archives, 
blog posts, etc.  

The hardest thing for us in going from Endeca was that Solr doesn't have the 
"N-Value" concept.  So if you want to drill down on, say "Department", you 
might do something like this:  facet=true&facet.field=DEPARTMENT , whereas 
Endeca would generate some esoteric N-value in an obscure Xml file so you would 
end up with a query like N=4567897865 .  Unfortunately for us, we had those 
N-values hardcoded all over our application and we ended up having to create a 
cross-reference table so that we didn't have to rewrite a ton of code at once.  

Overall, Solr's faceting is a lot more flexible than what Endeca has to offer.  
And it's a lot simpler to set up and understand.  However, Endeca's strong point 
here is that an admin could configure a lot of behaviors on the back end and 
then developers could just write to the API and it would do everything for 
them.  (Of course, this also encourages you to write your app so that it's 
Solr 1.4, including 2- & 3- level Dimensions, using facet.prefix.  The new 
features in Solr3 & especially Solr4 should make it easier and more efficient 
though.

If you have more questions about faceting, I would start a new discussion 
thread about it.  There are a lot of approaches to solving various problems so 
you may get a variety of answers.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: mroosendaal [mailto:mroosend...@yahoo.com] 
Sent: Wednesday, December 05, 2012 8:23 AM
To: solr-user@lucene.apache.org
Subject: RE: DIH nested entities don't work

Hi James,

Just to let you know, I've just completed the PoC and it worked great!
Thanks.

What I still find difficult is how to implement a 'guided' navigation
with Solr. That is one of the strengths of Endeca and with Solr you have to
create this yourself. What are your thoughts on that and what challenges did
you encounter when moving from Endeca?

Thanks,
Maarten



--
View this message in context: 
http://lucene.472066.n3.nabble.com/DIH-nested-entities-don-t-work-tp4015514p4024467.html
Sent from the Solr - User mailing list archive at Nabble.com.
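To make the facet.prefix technique concrete: if documents index a depth-prefixed path field (values below are made-up examples), e.g.

  category_path: 0/Departments
  category_path: 1/Departments/Shoes
  category_path: 2/Departments/Shoes/Running

then drilling into the second level under Departments is just:

  facet=true&facet.field=category_path&facet.prefix=1/Departments/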




Solr Nightly build server down ?

2012-12-05 Thread shreejay
Hi All, 

Is the server hosting nightly builds of Solr down?
https://builds.apache.org/job/Solr-Artifacts-4.x/lastSuccessfulBuild/artifact/solr/package/
 

If anyone knows an alternate link to download the nightly build please let
me know. 


--Shreejay




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Nightly-build-server-down-tp4024493.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: [Solrj 4.0] How use JOIN

2012-12-05 Thread Roman Slavik
Nobody?

I just found nested queries can be used
(http://mullingmethodology.blogspot.cz/2012/03/adventures-with-solr-join.html).
But I don't like this solution, it is too complicated and not very readable
... 

So is there any way to use JOIN with Solrj? Any ideas? :)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solrj-4-0-How-use-JOIN-tp4024262p4024490.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Cannot run Solr4 from Intellij Idea

2012-12-05 Thread Steve Rowe
Hi Artyom,

I don't use IntelliJ artifacts - I just edit/compile/test.

I can include this stuff in the IntelliJ configuration if you'll help me.  Can 
you share screenshots of what you're talking about, and/or IntelliJ config 
files?

Steve

On Dec 5, 2012, at 8:24 AM, Artyom  wrote:

> IntelliJ IDEA is not so intelligent with Solr: to fix this problem I've
> dragged these modules into the IDEA's artifact (parent module is wrong):
> 
> analysis-common
> analysis-extras
> analysis-uima
> clustering
> codecs
> codecs-resources
> dataimporthandler
> dataimporthandler-extras
> lucene-core
> lucene-core-resources
> solr-core
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Cannot-run-Solr4-from-Intellij-Idea-tp4024233p4024452.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr 4 in IntelliJ IDEA: make project errors

2012-12-05 Thread Steve Rowe
Hi Artyom,

The lucene_solr_4_0 branch IntelliJ setup works for me.

Sounds like Ivy isn't succeeding in downloading dependencies.

'ant idea' calls 'ant resolve', which uses Apache Ivy to download binary 
dependencies.

Can you post output from running 'ant resolve' at the top level?

Steve
 
On Dec 5, 2012, at 6:55 AM, Artyom  wrote:

> I have followed this instruction:
> http://wiki.apache.org/lucene-java/HowtoConfigureIntelliJ
> 
> 1. Downloaded lucene_solr_4_0 branch
> 2. ant idea
> 3. opened this project in IDEA
> 4. clicked Build/Make project menu
> 
> I got message: Compilation completed with 111 errors and 9 warnings
> C:\solr\lucene\benchmark\src\java\org\apache\lucene\benchmark\byTask\utils\StreamUtils.java
> Error: (32, 47) package org.apache.commons.compress.compressors does not
> exist
> Error: (33, 47) package org.apache.commons.compress.compressors does not
> exist
> ...
> 
> what should I do to make the project without errors?
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-4-in-IntelliJ-IDEA-make-project-errors-tp4024439.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Turn on Unicode support

2012-12-05 Thread Shawn Heisey

On 12/5/2012 7:31 AM, Nguyen, Vincent (CDC/OD/OADS) (CTR) wrote:

Is there a way to turn on support for Unicode characters in version 1.4.1?  The 
strange thing is that my coworker and I are supposed to have the same 
configuration, yet on her machine, there seems to be Unicode support enabled.

For example, if I use the SOLR admin to do a search for a term with the 
'Registered trademark ®' character, it will translate to 'â®'


Solr has full UTF8/Unicode support already.  It runs in a servlet 
container, like Jetty, Tomcat, etc.  The container must also be set up 
to use the UTF8 character set.  For interaction with Solr, your browser 
must also be set up for UTF8, which in some cases may mean your OS needs 
to have its locale settings changed.


Whatever source of index data you are using must also provide UTF8 data 
to Solr.  For example, if your installation of Solr is using the 
dataimport handler, you must also set up your data source as UTF8.


Thanks,
Shawn



Re: how make a suggester?

2012-12-05 Thread iwo
Are there any positive experiences with the suggester on Solr 4.0 with cloud?
I downloaded Solr 4 and followed http://wiki.apache.org/solr/Suggester, but I
don't get any suggestion items.

Any ideas?



-
Complicare è facile, semplificare é difficile. 
Complicated is easy, simple is hard.
quote: http://it.wikipedia.org/wiki/Bruno_Munari
--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-make-a-suggester-tp4020540p4024476.html
Sent from the Solr - User mailing list archive at Nabble.com.
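For comparison, a minimal setup along the lines of that wiki page looks roughly like this in solrconfig.xml (the field name is an example). A common gotcha is that the dictionary must be built before anything is returned - either via buildOnCommit as below, or by issuing one request with spellcheck.build=true:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">name</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">5</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>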


Turn on Unicode support

2012-12-05 Thread Nguyen, Vincent (CDC/OD/OADS) (CTR)
Is there a way to turn on support for Unicode characters in version 1.4.1?  The 
strange thing is that my coworker and I are supposed to have the same 
configuration, yet on her machine, there seems to be Unicode support enabled.  

For example, if I use the SOLR admin to do a search for a term with the 
'Registered trademark ®' character, it will translate to 'â®'

Is there something I'm missing in the configuration?

Vincent Vu Nguyen




RE: DIH nested entities don't work

2012-12-05 Thread mroosendaal
Hi James,

Just to let you know, I've just completed the PoC and it worked great!
Thanks.

What I still find difficult is how to implement a 'guided' navigation
with Solr. That is one of the strengths of Endeca and with Solr you have to
create this yourself. What are your thoughts on that and what challenges did
you encounter when moving from Endeca?

Thanks,
Maarten



--
View this message in context: 
http://lucene.472066.n3.nabble.com/DIH-nested-entities-don-t-work-tp4015514p4024467.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Is it ok to have a alphanumberic ID for entries?

2012-12-05 Thread Jack Krupansky
No, your Solr unique key field should ALWAYS be of type "string". In some 
cases you can get away with other types, but eventually you may run into 
some Solr feature which requires that the unique key field be a string, so 
it is better to avoid the potential problems from the start. You can put 
letters or digits in the string as you please, but keep the type as 
"string".


Also, DO NOT change the type of a field unless you are prepared to fully 
reindex your data.


-- Jack Krupansky
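For reference, the stock example schema.xml declares it roughly as follows (the field name is whatever your uniqueKey is):

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>id</uniqueKey>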

-Original Message- 
From: Otis Gospodnetic

Sent: Wednesday, December 05, 2012 7:15 AM
To: solr-user@lucene.apache.org
Subject: Re: Is it ok to have a alphanumberic ID for entries?

That would be fine.

Otis
--
SOLR Performance Monitoring - http://sematext.com/spm
On Dec 5, 2012 7:14 AM, "Spadez"  wrote:


I was wondering, with my current setup I have an ID field which is a
number.
If I wanted to then change it so my ID field was actually a mix of numbers
and letters (to do with the backend system), would this cause any sort of
problem?

I would never do any kind of sorting by ID on my search page, the ID will
purely be used in order to update and delete entries from the solr index.





--
View this message in context:
http://lucene.472066.n3.nabble.com/Is-it-ok-to-have-a-alphanumberic-ID-for-entries-tp4024441.html
Sent from the Solr - User mailing list archive at Nabble.com.





FW: Replication error and Shard Inconsistencies..

2012-12-05 Thread Annette Newton
Update:

I did a full restart of the solr cloud setup, stopped all the instances,
cleared down zookeeper and started them up individually.  I then removed the
index from one of the replicas, restarted solr and it replicated ok.  So I'm
wondering whether this is something that happens over a period of time. 

Also just to let you know I changed the schema a couple of times and
reloaded the cores on all instances previous to the problem.  Don't know if
this could have contributed to the problem.

Thanks.

-Original Message-
From: Annette Newton [mailto:annette.new...@servicetick.com] 
Sent: 05 December 2012 09:04
To: solr-user@lucene.apache.org
Subject: RE: Replication error and Shard Inconsistencies..

Hi Mark,

Thanks so much for the reply.

We are using the release version of 4.0..

It's very strange: replication appears to be underway, but no files are being
copied across.  I have attached both the log from the new node that I tried
to bring up and the Schema and config we are using.

I think it's probably something weird with our config, so I'm going to play
around with it today.  If I make any progress I'll send an update.

Thanks again.

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: 05 December 2012 00:04
To: solr-user@lucene.apache.org
Subject: Re: Replication error and Shard Inconsistencies..

Hey Annette, 

Are you using Solr 4.0 final? A version of 4x or 5x?

Do you have the logs for when the replica tried to catch up to the leader?

Stopping and starting the node is actually a fine thing to do. Perhaps you
can try it again and capture the logs.

If a node is not listed as live but is in the clusterstate, that is fine. It
shouldn't be consulted. To remove it, you either have to unload it with the
core admin api or you could manually delete its registered state under the
node states node that the Overseer looks at.

Also, it would be useful to see the logs of the new node coming up. There
should be info about what happens when it tries to replicate.

It almost sounds like replication is just not working for your setup at all
and that you have to tweak some configuration. You shouldn't see these nodes
as active then though - so we should get to the bottom of this.

- Mark

On Dec 4, 2012, at 4:37 AM, Annette Newton 
wrote:

> Hi all,
>  
> I have a quite weird issue with Solr cloud.  I have a 4 shard, 2 
> replica
setup, yesterday one of the nodes lost communication with the cloud setup,
which resulted in it trying to run replication, this failed, which has left
me with a Shard (Shard 4) that has one node with 2,833,940 documents on the
leader and 409,837 on the follower - obviously a big discrepancy and this
leads to queries returning differing results depending on which of these
nodes it gets the data from.  There is no indication of a problem on the
admin site other than the big discrepancy in the number of documents.  They
are all marked as active etc.
>  
> So I thought that I would force replication to happen again, by 
> stopping
and starting solr (probably the wrong thing to do) but this resulted in no
change.  So I turned off that node and replaced it with a new one.  In
zookeeper live nodes doesn't list that machine but it is still being shown
as active in the ClusterState.json, I have attached images showing this.
This means the new node hasn't replaced the old node but is now a replica on
Shard 1!  Also that node doesn't appear to have replicated Shard 1's data
anyway, it didn't get marked with replicating or anything. 
>  
> How do I clear the zookeeper state without taking down the entire solr
cloud setup?  How do I force a node to replicate from the others in the
shard?
>  
> Thanks in advance.
>  
> Annette Newton
>  
>  
> 





Re: Cannot run Solr4 from Intellij Idea

2012-12-05 Thread Artyom
IntelliJ IDEA is not so intelligent with Solr: to fix this problem I've
dragged these modules into the IDEA's artifact (parent module is wrong):

analysis-common
analysis-extras
analysis-uima
clustering
codecs
codecs-resources
dataimporthandler
dataimporthandler-extras
lucene-core
lucene-core-resources
solr-core



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Cannot-run-Solr4-from-Intellij-Idea-tp4024233p4024452.html
Sent from the Solr - User mailing list archive at Nabble.com.


The shard called `properties`

2012-12-05 Thread Markus Jelsma
Hi,

We're suddenly seeing a shard called `properties` in the cloud graph page when 
testing today's trunk with a clean Zookeeper data directory. Any idea where it 
comes from? We have not changed the solr.xml on any node.

Thanks


Re: Is it ok to have a alphanumberic ID for entries?

2012-12-05 Thread Otis Gospodnetic
That would be fine.

Otis
--
SOLR Performance Monitoring - http://sematext.com/spm
On Dec 5, 2012 7:14 AM, "Spadez"  wrote:

> I was wondering, with my current setup I have an ID field which is a
> number.
> If I wanted to then change it so my ID field was actually a mix of numbers
> and letters (to do with the backend system), would this cause any sort of
> problem?
>
> I would never do any kind of sorting by ID on my search page, the ID will
> purely be used in order to update and delete entries from the solr index.
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Is-it-ok-to-have-a-alphanumberic-ID-for-entries-tp4024441.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Is it ok to have a alphanumberic ID for entries?

2012-12-05 Thread Spadez
I was wondering, with my current setup I have an ID field which is a number.
If I wanted to then change it so my ID field was actually a mix of numbers
and letters (to do with the backend system), would this cause any sort of
problem? 

I would never do any kind of sorting by ID on my search page, the ID will
purely be used in order to update and delete entries from the solr index.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-it-ok-to-have-a-alphanumberic-ID-for-entries-tp4024441.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr 4 in IntelliJ IDEA: make project errors

2012-12-05 Thread Artyom
I have followed this instruction:
http://wiki.apache.org/lucene-java/HowtoConfigureIntelliJ

1. Downloaded lucene_solr_4_0 branch
2. ant idea
3. opened this project in IDEA
4. clicked Build/Make project menu

I got message: Compilation completed with 111 errors and 9 warnings
C:\solr\lucene\benchmark\src\java\org\apache\lucene\benchmark\byTask\utils\StreamUtils.java
Error: (32, 47) package org.apache.commons.compress.compressors does not
exist
Error: (33, 47) package org.apache.commons.compress.compressors does not
exist
...

what should I do to make the project without errors?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-in-IntelliJ-IDEA-make-project-errors-tp4024439.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 4 : Optimize very slow

2012-12-05 Thread Sandeep Mestry
@Walter, the daily optimization was introduced as we saw a decrease in the
performance for searches that happen during the peak hours - when loads of
updates take place on the index. Load testing proved slightly more
successful on optimized indexes. As a matter of fact, the merge factor was
increased from 10 to 30 to make it acceptable.

@Upayavira, thanks for the inputs. I will try to avoid the daily
optimizations, however it's sort of the workplace policy not to alter
anything except the essential configs for this release of the project. I take
your point that the daily optimizations are unnecessary; even then it's hard
to imagine why they take 6-8 hours a day when previously they finished
within half an hour.

@Michael, thanks for pointing that out, I will try using
solr.NIOFSDirectoryFactory
as currently I'm using the default one. Regarding your questions:
- Nothing has changed between Solr 1.4 and Solr 4 except the Solr config. I
have built 2 separate environments using Solr 1.4 and Solr 4 with the same
application code, db config etc. and can see the difference in the
optimization timings.
- I will check the Solr stats for GC, also during optimization. I see
that the index size reaches 17 Gig from 8.5 Gig and the CPU utilization
is then at its highest.
And I meant WAS only as in WebSphere Application Server.

@Otis, a quick Google for "optimize wunder Erick Otis" results in this mail
chain (ha ha!), but I will dig through the mail archives, thank you for your
suggestion.

Have a good day all, I will come back with my findings..

Best,
Sandeep


On 5 December 2012 06:07, Walter Underwood  wrote:

> It was not necessary under 1.4. It has never been necessary.
>
> It was not necessary in Ultraseek Server in 1996, using the same merging
> model.
>
> In some cases, it can be a good idea. Since you are continuously updating,
> this is not one of those cases.
>
> wunder
>
> On Dec 4, 2012, at 9:29 PM, Upayavira wrote:
>
> > I tried that search, without success :-(
> >
> > I suspect what Otis was trying to say was to question why you are
> > optimising. Optimise was necessary under 1.4, but with newer Solr, the
> > new TieredMergePolicy does a much better job of handling background
> > merging, reducing the need for optimize. Try just not doing it at all
> > and see if your index actually reaches a point where it is needed.
> >
> > Upayavira
> >
> > On Wed, Dec 5, 2012, at 12:31 AM, Otis Gospodnetic wrote:
> >> Hi,
> >>
> >> You should search the ML archives for : optimize wunder Erick Otis :)
> >>
> >> Is WAS really AWS? If so, if these are new EC2 instances you are
> >> unfortunately unable to do a fair apples to apples comparison. Have you
> >> tried a different set of instances?
> >>
> >> Otis
> >> --
> >> Performance Monitoring - http://sematext.com/spm
> >> On Dec 4, 2012 6:29 PM, "Sandeep Mestry"  wrote:
> >>
> >>> Hi All,
> >>>
> >>> I have recently migrated from solr 1.4 to solr 4 and have done the
> basic
> >>> changes required for solr 4 in solrconfig.xml and schema.xml. I have
> also
> >>> rebuilt the index set for solr 4.
> >>> We run optimize every morning at 4 am and we keep the index updates off
> >>> during this process.
> >>> Previously, with 1.4 - the optimization used to take around 20-30 mins
> per
> >>> shard but now with solr 4, its taking 6-8 hours or even more..
> >>> I have also tested the optimize from solr UI and that takes 6-8 hours
> too..
> >>> The hardware is saeme and, we have deployed solr under WAS.
> >>> There ar 4 shards and every shard contains around 8 - 9 Gig of data
> and we
> >>> are using master-slave configuration with rsync. I have not enabled
> soft
> >>> commit. Also, commiter process is scheduled to run every minute.
> >>>
> >>> I am not sure which part I'm missing, do let me know your inputs
> please.
> >>>
> >>> Many Thanks in advance,
> >>> Sandeep
> >>>
>
> --
> Walter Underwood
> wun...@wunderwood.org
>
>
>
>


Restricting search results by field value

2012-12-05 Thread Tom Mortimer
Hi everyone,

I've got a problem where I have docs with a source_id field, and there can be 
many docs from each source. Searches will typically return docs from many 
sources. I want to restrict the number of docs from each source in results, so 
there will be no more than (say) 3 docs from source_id=123 etc.

Field collapsing is the obvious approach, but I want to get the results back in 
relevancy order, not grouped by source_id. So it looks like I'll have to fetch 
more docs than I need to and re-sort them. It might even be better to count 
source_ids in the client code and drop excess docs that way, but the potential 
overhead is large.

Is there any way of doing this in Solr without hacking in a custom Lucene 
Collector? (which doesn't look all that straightforward).

cheers,
Tom
 

Re: Change searching field using Solritas

2012-12-05 Thread Romita Saha
Hi,

I have found the solution to my own question. I need to change the qf 
parameter in the solrconfig file.

Thanks and regards,
Romita 



From:   "Romita Saha" 
To: solr-user@lucene.apache.org, 
Date:   12/05/2012 06:20 PM
Subject:Change searching field using Solritas



Hi,

I am trying to change the Solr browser UI using Solritas. What I 
understand is that if I want to search something, I type it in the FIND 
box and click 'Submit Query', and Solr always searches in the field 'name' and 
gives the result accordingly. However, I want to search in some other field. 
I am not able to find out where in the Velocity conf files I need to modify to 
search in a field other than the field 'name'. Could you please help me 
find it out?

Thanks and regards,
Romita 
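For anyone else looking: in the stock example solrconfig.xml, the Solritas UI is backed by the /browse request handler, and the searched fields come from its qf default, roughly like the sketch below (field list abbreviated):

<requestHandler name="/browse" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">
      text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0
    </str>
    <!-- ... -->
  </lst>
</requestHandler>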


Re: Sorting by multi-valued field

2012-12-05 Thread Thomas Heigl
Hey Guys,

Thanks a lot for your input!

But my interpretation of "the next" start time is that it was dependent on
> the value of "NOW" when the query was executed (ie: some of the indexed
> values may be in the past) in which case that approach wouldn't work.


If the query was always a NOW query, there would be no problem. I could
just re-index the events once a day and re-adjust the value for next start.
The problem is that the user has a calendar/datepicker view and can choose
a future date within the next year and view events ordered by next-start
after that specific day.

As I see it now, I have two options:

1) Create a custom function query as suggested by Chris
2) Index separate documents for every start time and group them by event at
query time (see the sketch below)
3) A third possibility I thought of was to add a field for every day of
the year to each document that contains the next-start date for that
particular day: next_start_20121212_dt etc. Then I could order by the
dynamic field. But as only some of my events are recurring and few of those
recurring over long periods of time I think it does not make too much sense.

I might go for option two for now as I'm not a big fan of creating (and
especially maintaining) custom components.

Or is someone with an even better idea out there? ;)

Cheers,

Thomas


On Tue, Dec 4, 2012 at 11:34 PM, Chris Hostetter
wrote:

>
> : But it would be a lot harder than either splitting them out into
> : separate docs, or writing code to re-index docs when one of their
> : 'next-event' dates passes, with a new single valued 'next-event' field.
> : Less efficient, but easier to write/manage.
>
> Don't get me wrong -- if you can determine at index time which single
> value you want to use to sort on, then by all means that is going to be the
> best approach -- it's precisely the reason why
> FirstFieldValueUpdateProcessorFactory,
> LastFieldValueUpdateProcessorFactory, MaxFieldValueUpdateProcessorFactory,
> and MinFieldValueUpdateProcessorFactory exist.
>
> But my interpretation of "the next" start time is that it was dependent on
> the value of "NOW" when the query was executed (i.e. some of the indexed
> values may be in the past), in which case that approach wouldn't work.
>
> : On Tue, Dec 4, 2012, at 07:35 PM, Chris Hostetter wrote:
> : >
> : > : perfectly, but users expect the result set to be ordered by the next
> : > start
> : > : time.
> : > ...
> : > : Is there a more elegant way to do this in Solr? A function query or
> : > : subquery maybe? I thought about it for quite a while and couldn't
> come
> : > up
> : > : with a viable solution.
> : >
> : > I think you could conceivably write a custom function that built an
> : > UnInvertedField over your multivalued field, and then returned the
> : > "lowest
> : > value for each doc where the value is after 'NOW'" but there is nothing
> : > out of the box that will do this for you (and i haven't really thought
> : > hard about how viable this approach is ... i can't think of any obvious
> : > problems off the top of my head)
> : >
> : > -Hoss
> :
>
> -Hoss
>
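
(For completeness: the index-time approach Hoss mentions above is wired up as
an update processor chain; a sketch using MinFieldValueUpdateProcessorFactory
on a hypothetical next_start_dt field:

    <updateRequestProcessorChain name="single-next-start">
      <processor class="solr.MinFieldValueUpdateProcessorFactory">
        <str name="fieldName">next_start_dt</str>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>

As discussed, though, this only helps when the single sort value can be fixed
at index time.)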


self intersecting polygon errors

2012-12-05 Thread jend
 

Hi, I'm trying to play with polygon searches and have noticed this issue. It
seems that when one edge of the polygon crosses another, you invalidate the
polygon (sorry, I have no idea about the technical term -- "self-intersecting",
going by the error) and Solr cannot handle it.

As far as my Google fu goes, there is no fix; it's not specific to Solr, it's
more to do with the geometry itself.

Anyone have a work-around? Could this be fixed in Solr, or should I just try
to control this in the customer-facing front-end script by checking whether
the shape is valid?
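
(One common client-side repair, for what it's worth: JTS can often fix a
self-intersecting polygon via the zero-width buffer trick, which re-nodes the
rings and usually returns a valid geometry covering the same area. A sketch,
assuming the com.vividsolutions JTS packages current at the time:

    import com.vividsolutions.jts.geom.Geometry;
    import com.vividsolutions.jts.io.ParseException;
    import com.vividsolutions.jts.io.WKTReader;

    public class PolygonRepair {
        /** Repair a self-intersecting polygon before sending its WKT to Solr. */
        public static String repair(String wkt) throws ParseException {
            Geometry geom = new WKTReader().read(wkt);
            if (!geom.isValid()) {
                // buffer(0) re-nodes the geometry; a "bow-tie" polygon
                // typically comes back as a valid MultiPolygon
                geom = geom.buffer(0);
            }
            return geom.toText();
        }
    }

This is not guaranteed for every invalid shape, so validating in the front end
as suggested is still worthwhile.)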

Using Solr4 and, I think, the current version of JTS (downloaded/installed in
mid-November).

Thanks



Self-intersection at or near point (128.93654657968983, -24.34876200980167, NaN)


com.spatial4j.core.exception.InvalidShapeException: Self-intersection at or near point (128.93654657968983, -24.34876200980167, NaN)
	at com.spatial4j.core.shape.jts.JtsGeometry.<init>(JtsGeometry.java:90)
	at com.spatial4j.core.io.JtsShapeReadWriter.readShape(JtsShapeReadWriter.java:93)
	at com.spatial4j.core.context.SpatialContext.readShape(SpatialContext.java:195)
	at org.apache.lucene.spatial.query.SpatialArgsParser.parse(SpatialArgsParser.java:89)
	at org.apache.solr.schema.AbstractSpatialFieldType.getFieldQuery(AbstractSpatialFieldType.java:175)
	at org.apache.solr.search.SolrQueryParser.getFieldQuery(SolrQueryParser.java:171)
	at org.apache.lucene.queryparser.classic.QueryParserBase.getFieldQuery(QueryParserBase.java:657)
	at org.apache.lucene.queryparser.classic.QueryParserBase.handleQuotedTerm(QueryParserBase.java:1082)
	at org.apache.lucene.queryparser.classic.QueryParser.Term(QueryParser.java:462)
	at org.apache.lucene.queryparser.classic.QueryParser.Clause(QueryParser.java:257)
	at org.apache.lucene.queryparser.classic.QueryParser.Query(QueryParser.java:181)
	at org.apache.lucene.queryparser.classic.QueryParser.TopLevelQuery(QueryParser.java:170)
	at org.apache.lucene.queryparser.classic.QueryParserBase.parse(QueryParserBase.java:120)
	at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:72)
	at org.apache.solr.search.QParser.getQuery(QParser.java:143)
	at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:137)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:185)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
	at org.eclipse.jetty.server.Server.handle(Server.java:351)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
	at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
	at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
	at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
	at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
	at java.lang.Thread.run(Thread.java:722)

500




--
View this message in context: 
http://lucene.472066.n3.nabble.com/self-intersecting-polygon-errors-tp4024423.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Maximum number of cores

2012-12-05 Thread Majirus FANSI
Hi,
Please read the following page about solr and lots of cores:
http://wiki.apache.org/solr/LotsOfCores.
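There is no fixed ceiling in the configuration itself; heap, file descriptors,
and warm-up time set the practical limit. For illustration, cores in Solr 4.0
are declared in solr.xml like this (core names made up):

    <solr persistent="true">
      <cores adminPath="/admin/cores">
        <core name="core0" instanceDir="core0"/>
        <core name="core1" instanceDir="core1"/>
        <!-- add more cores here; resources are the real limit -->
      </cores>
    </solr>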
Cheers,

Maj


On 5 December 2012 07:01, Otis Gospodnetic wrote:

> Hi,
>
> It depends on your resources.  The other day somebody mentioned having
> 5 cores.
>
> Otis
> --
> SOLR Performance Monitoring - http://sematext.com/spm
> On Dec 5, 2012 12:47 AM, "S_Chawla"  wrote:
>
> > Hi,
> > I am using Solr 4.0 and I have created 10 cores. I want to know the
> > maximum number of cores that can be created in Solr.
> >
> >
> >
> > --
> > View this message in context:
> >
> http://lucene.472066.n3.nabble.com/Maximum-number-of-cores-tp4024398.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>


Re: ids parameter decides the result set and order, no matter what kind of query you enter

2012-12-05 Thread deniz
Replying to my own question: ids is used in the ResponseBuilder's internal
mapping structure, which is used for sorting and reordering the document list
before it is shown to the end user. Simply put, it stores the unique-key
values of the documents that are to be shown to the user, and each of these
ids maps to an actual document gathered from one of the shards, including its
position on that shard and its future position in the merged result set.
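
For reference, this is visible on the wire during the second phase of a
distributed request, when the coordinating node asks each shard for the
stored fields of the documents that made the merged top-N; a rough sketch
(host and ids invented):

    http://shard1:8983/solr/select?q=foo&isShard=true&ids=doc1,doc7,doc42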





-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/ids-parameter-decides-the-result-set-and-order-no-matter-what-kind-of-query-you-enter-tp4024390p4024412.html
Sent from the Solr - User mailing list archive at Nabble.com.