Re: Error Compiling the Source ExtractRequestHandler

2009-09-25 Thread busbus



 Where/How did you get the source for the ExtractingRequestHandler? 
 

http://svn.apache.org/repos/asf/lucene/solr/trunk/contrib/extraction.

Got all source (.java) files from this link.

And I am missing the parent class which the ExtractingRequestHandler is
extending.
-- 
View this message in context: 
http://www.nabble.com/Error-Compiling-the-Source-ExtractRequestHandler-tp25531022p25607118.html
Sent from the Solr - Dev mailing list archive at Nabble.com.



[jira] Updated: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-1458:
-

Attachment: SOLR-1458.patch

added 2 new methods to IndexDeletionPolicyWrapper:
{code:java}
public synchronized void reserveCommitPoint(Long indexCommitVersion)

public synchronized void releaseCommitPoint(Long indexCommitVersion)
{code}

Every commit point held by the ReplicationHandler should be reserved forever until
it is released.
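The intended reserve/release semantics can be sketched as a standalone reference-counting ledger. This is a hypothetical toy, not Solr's actual IndexDeletionPolicyWrapper code; the ref-count design and `isReserved` helper are assumptions for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy ledger: a commit point stays protected from deletion for as long
// as at least one reservation on its version is outstanding.
class CommitReservations {
    private final Map<Long, AtomicInteger> refs = new ConcurrentHashMap<>();

    public void reserveCommitPoint(Long indexCommitVersion) {
        refs.computeIfAbsent(indexCommitVersion, v -> new AtomicInteger()).incrementAndGet();
    }

    public void releaseCommitPoint(Long indexCommitVersion) {
        AtomicInteger count = refs.get(indexCommitVersion);
        if (count != null && count.decrementAndGet() <= 0) {
            refs.remove(indexCommitVersion);
        }
    }

    // A deletion policy would consult this before discarding a commit point.
    public boolean isReserved(Long indexCommitVersion) {
        return refs.containsKey(indexCommitVersion);
    }
}
```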

 Java Replication error: NullPointerException SEVERE: SnapPull failed on 
 2009-09-22 nightly
 --

 Key: SOLR-1458
 URL: https://issues.apache.org/jira/browse/SOLR-1458
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4
 Environment: CentOS x64
 8GB RAM
 Tomcat, running with 7G max memory; memory usage is 2GB, so it's not the 
 problem
 Host a: master
 Host b: slave
 Multiple single core Solr instances, using JNDI.
 Java replication
Reporter: Artem Russakovskii
Assignee: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-1458.patch, SOLR-1458.patch, SOLR-1458.patch, 
 SOLR-1458.patch


 After finally figuring out the new Java based replication, we have started 
 both the slave and the master and issued optimize to all master Solr 
 instances. This triggered some replication to go through just fine, but it 
 looks like some of it is failing.
 Here's what I'm getting in the slave logs, repeatedly for each shard:
 {code} 
 SEVERE: SnapPull failed 
 java.lang.NullPointerException
 at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:271)
 at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:258)
 at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at 
 java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 {code} 
 If I issue an optimize again on the master to one of the shards, it then 
 triggers a replication and replicates OK. I have a feeling that these 
 SnapPull failures appear later on but right now I don't have enough to form a 
 pattern.
 Here's replication.properties on one of the failed slave instances.
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 19:35:30 PDT 2009
 replicationFailedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 previousCycleTimeInSeconds=0
 timesFailed=113
 indexReplicatedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 indexReplicatedAt=1253759730020
 replicationFailedAt=1253759730020
 lastCycleBytesDownloaded=0
 timesIndexReplicated=113
 {code}
 and another
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 18:42:01 PDT 2009
 replicationFailedAtList=1253756490034,1253756460169
 previousCycleTimeInSeconds=1
 timesFailed=2
 indexReplicatedAtList=1253756521284,1253756490034,1253756460169
 indexReplicatedAt=1253756521284
 replicationFailedAt=1253756490034
 lastCycleBytesDownloaded=22932293
 timesIndexReplicated=3
 {code}
 Some relevant configs:
 In solrconfig.xml:
 {code}
 <!-- For docs see http://wiki.apache.org/solr/SolrReplication -->
 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="master">
     <str name="enable">${enable.master:false}</str>
     <str name="replicateAfter">optimize</str>
     <str name="backupAfter">optimize</str>
     <str name="commitReserveDuration">00:00:20</str>
   </lst>
   <lst name="slave">
     <str name="enable">${enable.slave:false}</str>
     <!-- url of master, from properties file -->
     <str name="masterUrl">${master.url}</str>
     <!-- how often to check master -->
     <str name="pollInterval">00:00:30</str>
   </lst>
 </requestHandler>
 {code}
 The slave then has this in solrcore.properties:
 {code}
 enable.slave=true
 

[jira] Commented: (SOLR-1449) solrconfig.xml syntax to add classpath elements from outside of instanceDir

2009-09-25 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759409#action_12759409
 ] 

Noble Paul commented on SOLR-1449:
--

I have a few questions in mind. 
* Is this an issue which users have reported? In my experience with the Solr
mailing list, I have yet to see a request where users wish to add arbitrary
directories to the classpath.
* How important is it that this feature be in 1.4?

I am not aware of any project which allows this level of configurability for
the classpath. Most users never have to write custom components for Solr. In
our organization, I have encountered very few cases where people needed to add
custom jars to the classpath, and even then they were trivial jars.

I am -1 on adding this to 1.4.



 solrconfig.xml syntax to add classpath elements from outside of instanceDir
 ---

 Key: SOLR-1449
 URL: https://issues.apache.org/jira/browse/SOLR-1449
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
 Fix For: 1.4

 Attachments: SOLR-1449.patch, SOLR-1449.patch


 the idea has been discussed numerous times that it would be nice if there was 
 a way to configure a core to load plugins from specific jars (or classes 
 style directories) by path  w/o needing to copy them to the ./lib dir in 
 the instanceDir.
 The current workaround is symlinks but that doesn't really help the 
 situation of the Solr Release artifacts, where we wind up making numerous 
 copies of jars to support multiple example directories (you can't have 
 reliable symlinks in zip files)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1292) show lucene fieldcache entries and sizes

2009-09-25 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759410#action_12759410
 ] 

Shalin Shekhar Mangar commented on SOLR-1292:
-

bq. I did an open call hierarchy on UnInvertedField.memSize, which is called by 
toString which isn't called by anything so I assume it never makes it to the 
web UI?

Jason is right. UnInvertedField's memSize never shows up on the statistics
page. I'll open another issue to fix this.


 show lucene fieldcache entries and sizes
 

 Key: SOLR-1292
 URL: https://issues.apache.org/jira/browse/SOLR-1292
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
Assignee: Hoss Man
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1292.patch


 See LUCENE-1749, FieldCache introspection API

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-1449) solrconfig.xml syntax to add classpath elements from outside of instanceDir

2009-09-25 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759409#action_12759409
 ] 

Noble Paul edited comment on SOLR-1449 at 9/25/09 12:02 AM:


I have a few questions in mind. 
* Is this an issue which users have reported? In my experience with the Solr
mailing list, I have yet to see a request where users wish to add arbitrary
directories to the classpath.
* How important is it that this feature be in 1.4?
* Users in general have a lot of problems with classloading. Even with the
current support for a single lib directory, I have seen many users having
trouble with classloading. This can only add to that confusion.

I am not aware of any project which allows this level of configurability for
the classpath. Most users never have to write custom components for Solr. In
our organization, I have encountered very few cases where people needed to add
custom jars to the classpath, and even then they were trivial jars that could
be put into solr_home/lib anyway.

I am -1 on adding this to 1.4.



  was (Author: noble.paul):
I have a few questions in mind. 
* Is this an issue which users have reported? in my experience with Solr 
mailing list, I am yet to see a request  where users wish to add arbitrary 
directories to classpath
* How important is this feature to be in 1.4?
 I am not aware of any project which allows this level of configurability for 
classpath. Most of the users never have to write custom components for Solr. In 
our organization, I have encountered very few cases where they needed to add 
custom jars to classpath. Even in cases where they did , they were some trivial 
jars.

I am -1 on adding this to 1.4.


  
 solrconfig.xml syntax to add classpath elements from outside of instanceDir
 ---

 Key: SOLR-1449
 URL: https://issues.apache.org/jira/browse/SOLR-1449
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
 Fix For: 1.4

 Attachments: SOLR-1449.patch, SOLR-1449.patch


 the idea has been discussed numerous times that it would be nice if there was 
 a way to configure a core to load plugins from specific jars (or classes 
 style directories) by path  w/o needing to copy them to the ./lib dir in 
 the instanceDir.
 The current workaround is symlinks but that doesn't really help the 
 situation of the Solr Release artifacts, where we wind up making numerous 
 copies of jars to support multiple example directories (you can't have 
 reliable symlinks in zip files)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr

2009-09-25 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759415#action_12759415
 ] 

Bill Bell commented on SOLR-773:


Brad,

Were you able to complete your patch?

You commented:
Brad Giaccio added a comment - 12/Aug/09 04:08 PM
I'm going to have to disagree with Chris's assertion that more than a
functionQuery is needed. I have a functionQuery that simply starts by getting a
TermEnum at the minimum latitude that could possibly match your
spatial extent, and exits when it gets to the max lat. This way I take
advantage of the lexical ordering of the strings, and then only have to
compute distances for things that are in the box.

This code runs sub-second on a shard of 12 million documents; actually it's
sub-second hitting 8 shards of 12 million each.

Just a thought. If you're interested, I have a searchComponent that makes use of
this filter that I can attach.


 Incorporate Local Lucene/Solr
 -

 Key: SOLR-773
 URL: https://issues.apache.org/jira/browse/SOLR-773
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.5

 Attachments: exampleSpatial.zip, lucene-spatial-2.9-dev.jar, 
 lucene.tar.gz, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
 SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
 SOLR-773-local-lucene.patch, SOLR-773-spatial_solr.patch, SOLR-773.patch, 
 SOLR-773.patch, spatial-solr.tar.gz


 Local Lucene has been donated to the Lucene project.  It has some Solr 
 components, but we should evaluate how best to incorporate it into Solr.
 See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1461) Expose UnInvertedField's memory usage in statistics page

2009-09-25 Thread Shalin Shekhar Mangar (JIRA)
Expose UnInvertedField's memory usage in statistics page


 Key: SOLR-1461
 URL: https://issues.apache.org/jira/browse/SOLR-1461
 Project: Solr
  Issue Type: Improvement
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4


UnInvertedField has a memSize method but the memory usage is never shown on the 
statistics page.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1292) show lucene fieldcache entries and sizes

2009-09-25 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759419#action_12759419
 ] 

Shalin Shekhar Mangar commented on SOLR-1292:
-

I opened SOLR-1461

 show lucene fieldcache entries and sizes
 

 Key: SOLR-1292
 URL: https://issues.apache.org/jira/browse/SOLR-1292
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
Assignee: Hoss Man
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1292.patch


 See LUCENE-1749, FieldCache introspection API

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1449) solrconfig.xml syntax to add classpath elements from outside of instanceDir

2009-09-25 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759457#action_12759457
 ] 

Noble Paul commented on SOLR-1449:
--

Let us get the facts; we may be barking up the wrong tree.

We have a 116MB distribution, out of which the solr.war + single/multicore
example is around 4MB. Most users need only this 4MB (99% of users do not need
clustering + Solr Cell). If we implement this issue we may cut down the size of
the distro by around 20MB (by eliminating duplication of the Tika jars). What
we should have is a lighter version (solr.war + example solr home, around 4MB)
and a full version.

I am sure most users will be happy with the minimal Solr. The rest will happily
download the whole thing however big it is.

This is not to say that we don't need to reduce the size of the distro. But
adding complexity just for this is not really required. Just by adding a .sh and
.bat file to the Tika example, we can add jars from an external path.

 solrconfig.xml syntax to add classpath elements from outside of instanceDir
 ---

 Key: SOLR-1449
 URL: https://issues.apache.org/jira/browse/SOLR-1449
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
 Fix For: 1.4

 Attachments: SOLR-1449.patch, SOLR-1449.patch


 the idea has been discussed numerous times that it would be nice if there was 
 a way to configure a core to load plugins from specific jars (or classes 
 style directories) by path  w/o needing to copy them to the ./lib dir in 
 the instanceDir.
 The current workaround is symlinks but that doesn't really help the 
 situation of the Solr Release artifacts, where we wind up making numerous 
 copies of jars to support multiple example directories (you can't have 
 reliable symlinks in zip files)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1449) solrconfig.xml syntax to add classpath elements from outside of instanceDir

2009-09-25 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759444#action_12759444
 ] 

Erik Hatcher commented on SOLR-1449:


I haven't tried out Hoss' patch, but based on the write-up, I'm +1.

Having a configurable list of directories to load from will clean up our
examples and reduce the size (or installation steps) of our example app. It will
allow various plugins to live in one place and be referred to without having to
copy files all over the place.

I think this is an important feature to get into 1.4.

 solrconfig.xml syntax to add classpath elements from outside of instanceDir
 ---

 Key: SOLR-1449
 URL: https://issues.apache.org/jira/browse/SOLR-1449
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
 Fix For: 1.4

 Attachments: SOLR-1449.patch, SOLR-1449.patch


 the idea has been discussed numerous times that it would be nice if there was 
 a way to configure a core to load plugins from specific jars (or classes 
 style directories) by path  w/o needing to copy them to the ./lib dir in 
 the instanceDir.
 The current workaround is symlinks but that doesn't really help the 
 situation of the Solr Release artifacts, where we wind up making numerous 
 copies of jars to support multiple example directories (you can't have 
 reliable symlinks in zip files)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1311) pseudo-field-collapsing

2009-09-25 Thread Marc Sturlese (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759489#action_12759489
 ] 

Marc Sturlese commented on SOLR-1311:
-

Well, the thing is my patch performs very well precisely because, for now, it
cannot be integrated as a plugin. The field collapsing patch does 2 searches:
one to pick the ids to collapse, and a second to filter those ids in the main
search. What I do is pseudo-collapse straight in the main search, reordering the
ids in getDocListAndSetNC and getDocListNC, so response times are almost the
same with or without the patch.
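The reordering described above can be sketched as a standalone toy (my reading of the description, not the actual patch code, which lives inside SolrIndexSearcher): within the first `considerHowMany` docs, an adjacent duplicate of the previous surviving collapse value is deferred to the tail of that window, the first collapsed doc landing last.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: docs.get(i) = {docId, collapseFieldValue}, already sorted by relevance.
class PseudoCollapse {
    static List<int[]> reorder(List<int[]> docs, int considerHowMany) {
        int n = Math.min(considerHowMany, docs.size());
        List<int[]> kept = new ArrayList<>();
        List<int[]> collapsed = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            int[] d = docs.get(i);
            if (!kept.isEmpty() && kept.get(kept.size() - 1)[1] == d[1]) {
                collapsed.add(d);          // adjacent duplicate: defer to the tail
            } else {
                kept.add(d);
            }
        }
        List<int[]> result = new ArrayList<>(kept);
        for (int i = collapsed.size() - 1; i >= 0; i--) {
            result.add(collapsed.get(i));  // earliest-collapsed doc ends up last
        }
        result.addAll(docs.subList(n, docs.size()));  // docs past the window untouched
        return result;
    }
}
```

Run against the 9-document example in the issue description, this reproduces both listings (considerHowMany = 5 and 9).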

 pseudo-field-collapsing
 ---

 Key: SOLR-1311
 URL: https://issues.apache.org/jira/browse/SOLR-1311
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.4
Reporter: Marc Sturlese
 Fix For: 1.5

 Attachments: SOLR-1311-pseudo-field-collapsing.patch


 I am trying to develop a new way of doing field collapsing based on the
 adjacent field collapsing algorithm. I started developing it because I
 am experiencing performance problems with the field collapsing patch on a big
 index (8G).
 The algorithm does adjacent pseudo-field collapsing. It collapses only the
 first X documents. Instead of making the collapsed docs disappear, the
 algorithm sends them to a given position of the relevance results list.
 The reason I collapse only the first X documents is that if I have,
 for example, 60 results and I am showing 10 results per page, I really
 don't need to do collapsing on page 3, let alone page 3000. Doing
 this I am noticing dramatically better performance. The problem is I couldn't
 find a way to plug the algorithm in as a component and keep good performance; I
 had to hack a few classes in SolrIndexSearcher.java.
 This patch is just experimental and for testing purposes. In case someone
 finds it interesting, it would be good to find a way to integrate it better
 than it is at the moment.
 Advice is more than welcome.

 Functionality:
 In solrconfig.xml we specify the pseudo-collapsing parameters:
  <str name="plus.considerMoreDocs">true</str>
  <str name="plus.considerHowMany">3000</str>
  <str name="plus.considerField">name</str>
 (at the moment there's no threshold or other parameters that exist in the
 current collapse-field patch)
 plus.considerMoreDocs enables pseudo-collapsing
 plus.considerHowMany sets the number of result documents to which we want
 to apply the algorithm
 plus.considerField is the field to pseudo-collapse on
 If the number of results is lower than plus.considerHowMany, the algorithm
 will be applied to all the results.
 Let's say there is a query with 60 results and we've set considerHowMany
 to 3000 (and we already have the docs sorted by relevance).
 What adjacent-pseudo-collapse does is: if the 2nd doc has to be collapsed, it
 is sent to position 2999 of the relevance results array. If the 3rd has to be
 collapsed too, it goes to position 2998, and so on.
 The algorithm is not applied when a sortspec is set or when plus.considerMoreDocs
 is set to false. Neither is it applied when using MoreLikeThisRequestHandler.
 Example with a query of 9 results:
 Results sorted by relevance without pseudo-collapse-algorithm:
 doc1 - collapse_field_value 3
 doc2 - collapse_field_value 3
 doc3 - collapse_field_value 4
 doc4 - collapse_field_value 7
 doc5 - collapse_field_value 6
 doc6 - collapse_field_value 6
 doc7 - collapse_field_value 5
 doc8 - collapse_field_value 1
 doc9 - collapse_field_value 2
 Results pseudo-collapsed with plus.considerHowMany = 5
 doc1 - collapse_field_value 3
 doc3 - collapse_field_value 4
 doc4 - collapse_field_value 7
 doc5 - collapse_field_value 6
 doc2 - collapse_field_value 3*
 doc6 - collapse_field_value 6
 doc7 - collapse_field_value 5
 doc8 - collapse_field_value 1
 doc9 - collapse_field_value 2
 Results pseudo-collapsed with plus.considerHowMany = 9
 doc1 - collapse_field_value 3
 doc3 - collapse_field_value 4
 doc4 - collapse_field_value 7
 doc5 - collapse_field_value 6
 doc7 - collapse_field_value 5
 doc8 - collapse_field_value 1
 doc9 - collapse_field_value 2
 doc6 - collapse_field_value 6*
 doc2 - collapse_field_value 3*
 *pseudo-collapsed documents

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1461) Expose UnInvertedField's memory usage in statistics page

2009-09-25 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759504#action_12759504
 ] 

Shalin Shekhar Mangar commented on SOLR-1461:
-

Should we have UnInvertedField itself implement SolrInfoMBean? I can't see any 
other way of exposing this bit.

 Expose UnInvertedField's memory usage in statistics page
 

 Key: SOLR-1461
 URL: https://issues.apache.org/jira/browse/SOLR-1461
 Project: Solr
  Issue Type: Improvement
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4


 UnInvertedField has a memSize method but the memory usage is never shown on 
 the statistics page.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Error Compiling the Source ExtractRequestHandler

2009-09-25 Thread Grant Ingersoll


On Sep 25, 2009, at 2:23 AM, busbus wrote:

Where/How did you get the source for the ExtractingRequestHandler?



http://svn.apache.org/repos/asf/lucene/solr/trunk/contrib/extraction.

Got all source (.java) files from this link.

And I am missing the parent class which the ExtractingRequestHandler is
extending.


You need to get trunk, too, i.e. http://svn.apache.org/repos/asf/lucene/solr/trunk/ 
 and compile that first.


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



[jira] Commented: (SOLR-1461) Expose UnInvertedField's memory usage in statistics page

2009-09-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759532#action_12759532
 ] 

Yonik Seeley commented on SOLR-1461:


The memory usage of UnInvertedField is shown... just not in the same place as 
the Lucene field cache.
In statistics, look under fieldValueCache.

 Expose UnInvertedField's memory usage in statistics page
 

 Key: SOLR-1461
 URL: https://issues.apache.org/jira/browse/SOLR-1461
 Project: Solr
  Issue Type: Improvement
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4


 UnInvertedField has a memSize method but the memory usage is never shown on 
 the statistics page.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1462) DIH won't run script transformer anymore. Complains I'm not running Java 6

2009-09-25 Thread Edward Rudd (JIRA)
DIH won't run script transformer anymore.  Complains I'm not running Java 6
--

 Key: SOLR-1462
 URL: https://issues.apache.org/jira/browse/SOLR-1462
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
 Environment: CentOS 5.3 with Java-1.6.0-openjdk-1.6.0.0-1.2.b09.el5 
(this version has been installed since August)
Reporter: Edward Rudd


Before a reboot 2 weeks ago, DIH worked fine, but now it constantly returns this
error any time an import is run. Any clues how to diagnose what is going on?


org.apache.solr.handler.dataimport.DataImportHandlerException: script can be 
used only in java 6 or above Processing Document # 1
at 
org.apache.solr.handler.dataimport.ScriptTransformer.initEngine(ScriptTransformer.java:87)
at 
org.apache.solr.handler.dataimport.ScriptTransformer.transformRow(ScriptTransformer.java:50)
at 
org.apache.solr.handler.dataimport.DebugLogger$3.transformRow(DebugLogger.java:211)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.applyTransformer(EntityProcessorWrapper.java:195)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextModifiedRowKey(EntityProcessorWrapper.java:252)
at 
org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:607)
at 
org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:245)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:159)
at 
org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:354)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:395)
at 
org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:203)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at 
org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.lang.NullPointerException
at 
org.apache.solr.handler.dataimport.ScriptTransformer.initEngine(ScriptTransformer.java:82)
... 31 more
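The NPE at ScriptTransformer.initEngine line 82 suggests ScriptEngineManager returned no JavaScript engine, which some OpenJDK builds can do even on Java 6 when Rhino isn't bundled. A small standalone diagnostic (not part of Solr; offered as one way to check, under that assumption) that lists what the JVM actually provides:

```java
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

// Prints the JVM version and every script engine the runtime can find.
// If "JavaScript" resolves to null here, DIH's ScriptTransformer would hit
// the same null engine, regardless of what java.version says.
public class EngineCheck {
    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        ScriptEngineManager mgr = new ScriptEngineManager();
        for (ScriptEngineFactory f : mgr.getEngineFactories()) {
            System.out.println(f.getEngineName() + " -> " + f.getNames());
        }
        System.out.println("JavaScript engine: " + mgr.getEngineByName("JavaScript"));
    }
}
```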


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759546#action_12759546
 ] 

Yonik Seeley commented on SOLR-1458:


Shouldn't there be some kind of option on the deletion policy... say 
keepLastOptimized?
Then the ReplicationHandler would only have to flip it on (if it weren't 
already on).  It doesn't seem like the ReplicationHandler should be the one to 
pick which commit points to reserve forever.
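The suggested flag could look roughly like this toy model (the class names, the boolean flag, and the commit record are all assumptions for illustration; Lucene's real IndexDeletionPolicy receives IndexCommit lists):

```java
import java.util.List;

// Toy commit record standing in for Lucene's IndexCommit.
class Commit {
    final long version;
    final boolean optimized;
    boolean deleted;
    Commit(long version, boolean optimized) { this.version = version; this.optimized = optimized; }
}

// Hypothetical policy: always keep the newest commit; when the flag is on,
// also keep the most recent optimized commit, instead of the handler
// reserving individual commit points forever.
class KeepLastOptimizedPolicy {
    boolean keepLastOptimized = true;  // ReplicationHandler would just flip this on

    void onCommit(List<Commit> commits) {  // commits ordered oldest to newest
        Commit newest = commits.get(commits.size() - 1);
        Commit lastOptimized = null;
        for (Commit c : commits) {
            if (c.optimized) lastOptimized = c;
        }
        for (Commit c : commits) {
            if (c == newest) continue;
            if (keepLastOptimized && c == lastOptimized) continue;
            c.deleted = true;
        }
    }
}
```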

 Java Replication error: NullPointerException SEVERE: SnapPull failed on 
 2009-09-22 nightly
 --

 Key: SOLR-1458
 URL: https://issues.apache.org/jira/browse/SOLR-1458
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4
 Environment: CentOS x64
 8GB RAM
 Tomcat, running with 7G max memory; memory usage is 2GB, so it's not the 
 problem
 Host a: master
 Host b: slave
 Multiple single core Solr instances, using JNDI.
 Java replication
Reporter: Artem Russakovskii
Assignee: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-1458.patch, SOLR-1458.patch, SOLR-1458.patch, 
 SOLR-1458.patch


 After finally figuring out the new Java based replication, we have started 
 both the slave and the master and issued optimize to all master Solr 
 instances. This triggered some replication to go through just fine, but it 
 looks like some of it is failing.
 Here's what I'm getting in the slave logs, repeatedly for each shard:
 {code} 
 SEVERE: SnapPull failed 
 java.lang.NullPointerException
 at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:271)
 at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:258)
 at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at 
 java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 {code} 
 If I issue an optimize again on the master to one of the shards, it then 
 triggers a replication and replicates OK. I have a feeling that these 
 SnapPull failures appear later on but right now I don't have enough to form a 
 pattern.
 Here's replication.properties on one of the failed slave instances.
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 19:35:30 PDT 2009
 replicationFailedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 previousCycleTimeInSeconds=0
 timesFailed=113
 indexReplicatedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 indexReplicatedAt=1253759730020
 replicationFailedAt=1253759730020
 lastCycleBytesDownloaded=0
 timesIndexReplicated=113
 {code}
 and another
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 18:42:01 PDT 2009
 replicationFailedAtList=1253756490034,1253756460169
 previousCycleTimeInSeconds=1
 timesFailed=2
 indexReplicatedAtList=1253756521284,1253756490034,1253756460169
 indexReplicatedAt=1253756521284
 replicationFailedAt=1253756490034
 lastCycleBytesDownloaded=22932293
 timesIndexReplicated=3
 {code}
 Some relevant configs:
 In solrconfig.xml:
 {code}
 <!-- For docs see http://wiki.apache.org/solr/SolrReplication -->
 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="master">
     <str name="enable">${enable.master:false}</str>
     <str name="replicateAfter">optimize</str>
     <str name="backupAfter">optimize</str>
     <str name="commitReserveDuration">00:00:20</str>
   </lst>
   <lst name="slave">
     <str name="enable">${enable.slave:false}</str>
     <!-- url of master, from properties file -->
     <str name="masterUrl">${master.url}</str>
     <!-- how often to check master -->
     <str name="pollInterval">00:00:30</str>
   </lst>
 </requestHandler>
 {code}
 The slave then has this in solrcore.properties:
 {code}
 enable.slave=true
 

[jira] Resolved: (SOLR-1461) Expose UnInvertedField's memory usage in statistics page

2009-09-25 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-1461.
---

Resolution: Won't Fix

 Expose UnInvertedField's memory usage in statistics page
 

 Key: SOLR-1461
 URL: https://issues.apache.org/jira/browse/SOLR-1461
 Project: Solr
  Issue Type: Improvement
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.4


 UnInvertedField has a memSize method but the memory usage is never shown on 
 the statistics page.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1448) Addition of weblogic.xml required for solr to run under weblogic 10.3

2009-09-25 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-1448.
---

Resolution: Won't Fix

I don't see a need to put specific files for specific containers into Solr.  
See http://wiki.apache.org/solr/SolrWeblogic.

 Addition of weblogic.xml required for solr to run under weblogic 10.3
 -

 Key: SOLR-1448
 URL: https://issues.apache.org/jira/browse/SOLR-1448
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 1.4
 Environment: Weblogic 10.3
Reporter: Ilan Rabinovitch
Priority: Minor
 Fix For: 1.4

 Attachments: weblogic.xml


 Weblogic appears to have filters enabled even on FORWARD, which is listed as 
 something that will not function properly in the Solr documentation. As a 
 result, the administrative application generates a StackOverflow when 
 accessed. 
 This can be resolved by adding the attached weblogic.xml file to solr.  No 
 other changes are required.
 <?xml version='1.0' encoding='UTF-8'?>
 <weblogic-web-app
     xmlns="http://www.bea.com/ns/weblogic/90"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://www.bea.com/ns/weblogic/90
         http://www.bea.com/ns/weblogic/90/weblogic-web-app.xsd">
   <container-descriptor>
     <filter-dispatched-requests-enabled>false</filter-dispatched-requests-enabled>
   </container-descriptor>
 </weblogic-web-app>




[jira] Commented: (SOLR-1314) Upgrade Carrot2 to version 3.1.0

2009-09-25 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12759549#action_12759549
 ] 

Grant Ingersoll commented on SOLR-1314:
---

Hi Stazek,

Now that Lucene is final, can we finalize the jar for this one?  Also, this 
final JAR will handle the license and FastVector stuff, right?

Thanks,
Grant

 Upgrade Carrot2 to version 3.1.0
 

 Key: SOLR-1314
 URL: https://issues.apache.org/jira/browse/SOLR-1314
 Project: Solr
  Issue Type: Task
Reporter: Stanislaw Osinski
Assignee: Grant Ingersoll
 Fix For: 1.4


 As soon as Lucene 2.9 is released, Carrot2 3.1.0 will come out with bug fixes 
 in clustering algorithms and improved clustering in Chinese. The upgrade 
 should be a matter of upgrading {{carrot2-mini.jar}} and 
 {{google-collections.jar}}.




RE: [PMX:FAKE_SENDER] Re: large OR-boolean query

2009-09-25 Thread Luo, Jeff
We are searching strings, not numbers. The reason we are doing this kind
of query is that we have two big indexes, say, a collection of medicine
drugs and a collection of research papers. I first run a query against
the drugs index and get 102400 unique drug names back. Then I need to
find all the research papers where one or more of the 102400 drug names
are mentioned, hence the large OR query. This is a kind of JOIN query
between 2 indexes, which an article in the lucid web site comparing
databases and search engines briefly touched.

I was able to issue 100 parallel small queries against solr shards and
get the results back successfully (even sorted). My custom code is less
than 100 lines, mostly in my SearchHandler.handleRequestBody. But I have
problem summing up the correct facet counts because the faceting counts
from each shard are not disjunctive.

Based on what is suggested by two other responses to my question, I
think it is possible that the master can pass the original large query
to each shard, and each shard will split the large query into 100 lower
level disjunctive lucene queries, fire them against its Lucene index in
a parallel way and merge the results. Then each shard shall only return
1 (instead of 100) result set to the master with disjunctive faceting
counts. It seems that the faceting problem can be solved in this way. I
would appreciate it if you could let me know if this approach is
feasible and correct; what Solr plug-ins are needed (my guess is a custom
parser and query-component?)

Thanks,

Jeff   



-Original Message-
From: Grant Ingersoll [mailto:gsing...@apache.org] 
Sent: Thursday, September 24, 2009 10:01 AM
To: solr-dev@lucene.apache.org
Subject: [PMX:FAKE_SENDER] Re: large OR-boolean query


On Sep 23, 2009, at 4:26 PM, Luo, Jeff wrote:

 Hi,

 We are experimenting a parallel approach to issue a large OR-Boolean
 query, e.g., keywords:(1 OR 2 OR 3 OR ... OR 102400), against several
 solr shards.

 The way we are trying is to break the large query into smaller ones,
 e.g.,
 the example above can be broken into 10 small queries: keywords:(1  
 OR 2
 OR 3 OR ... OR 1024), keywords:(1025 OR 1026 OR 1027 OR ... OR 2048),
 etc

 Now each shard will get 10 requests and the master will merge the
 results coming back from each shard, similar to the regular  
 distributed
 search.


Can you tell us a little bit more about the why/what of this?  Are you  
really searching numbers or are those just for example?  Do you care  
about the score or do you just need to know whether the result is  
there or not?


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search



[jira] Commented: (SOLR-1294) SolrJS/Javascript client fails in IE8!

2009-09-25 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12759555#action_12759555
 ] 

Grant Ingersoll commented on SOLR-1294:
---

Alex,

How about instead of forking, you create a patch?  I don't see the point of 
maintaining it elsewhere, esp. under the ASL 2.0

 SolrJS/Javascript client fails in IE8!
 --

 Key: SOLR-1294
 URL: https://issues.apache.org/jira/browse/SOLR-1294
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.4
Reporter: Eric Pugh
Assignee: Ryan McKinley
 Fix For: 1.4

 Attachments: SOLR-1294-IE8.patch, SOLR-1294.patch, 
 solrjs-ie8-html-syntax-error.patch


 SolrJS seems to fail with 'jQuery.solrjs' is null or not an object errors 
 under IE8.  I am continuing to test if this occurs in IE 6 and 7 as well.  
 This happens on both the Sample online site at 
 http://solrjs.solrstuff.org/test/reuters/ as well as the 
 /trunk/contrib/javascript library.   Seems to be a show stopper from the 
 standpoint of really using this library!
 I have posted a screenshot of the error at 
 http://img.skitch.com/20090717-jejm71u6ghf2dpn3mwrkarigwm.png
 The error is just a whole bunch of repeated messages in the vein of:
 Message: 'jQuery.solrjs' is null or not an object
 Line: 24
 Char: 1
 Code: 0
 URI: file:///C:/dev/projects/lib/solr/contrib/javascript/src/core/QueryItem.js
 Message: 'jQuery.solrjs' is null or not an object
 Line: 37
 Char: 1
 Code: 0
 URI: file:///C:/dev/projects/lib/solr/contrib/javascript/src/core/Manager.js
 Message: 'jQuery.solrjs' is null or not an object
 Line: 24
 Char: 1
 Code: 0
 URI: 
 file:///C:/dev/projects/lib/solr/contrib/javascript/src/core/AbstractSelectionView.js
 Message: 'jQuery.solrjs' is null or not an object
 Line: 27
 Char: 1
 Code: 0
 URI: 
 file:///C:/dev/projects/lib/solr/contrib/javascript/src/core/AbstractWidget.js




Re: [PMX:FAKE_SENDER] Re: large OR-boolean query

2009-09-25 Thread Walter Underwood
This would work a lot better if you did the join at index time. For  
each paper, add a field with all the related drug names (or whatever  
you want to search for), then search on that field.


With the current design, it will never be fast and never scale. Each  
lookup has a cost, so expanding a query to a thousand terms will  
always be slow. Distributing the query to multiple shards will only  
make a bad design slightly faster.


This is fundamental to search index design. The schema is flat, fully- 
denormalized, no joins. You tag each document with the terms that you  
will use to find it. Then you search for those terms directly.
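As an illustration of that index-time join, each paper document would carry a multivalued field of related drug names and be searched on directly. A sketch in Solr's XML update format, with entirely hypothetical field names and values:

```xml
<add>
  <doc>
    <field name="id">paper-42</field>
    <field name="title">Example paper title</field>
    <!-- denormalized at index time: every related drug name is tagged on the paper -->
    <field name="drugName">atorvastatin</field>
    <field name="drugName">simvastatin</field>
  </doc>
</add>
```

A query like drugName:atorvastatin then finds the papers with a single term lookup instead of a 102400-term disjunction.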


wunder

On Sep 25, 2009, at 7:52 AM, Luo, Jeff wrote:

We are searching strings, not numbers. The reason we are doing this  
kind
of query is that we have two big indexes, say, a collection of  
medicine

drugs and a collection of research papers. I first run a query against
the drugs index and get 102400 unique drug names back. Then I need to
find all the research papers where one or more of the 102400 drug  
names

are mentioned, hence the large OR query. This is a kind of JOIN query
between 2 indexes, which an article in the lucid web site comparing
databases and search engines briefly touched.

I was able to issue 100 parallel small queries against solr shards and
get the results back successfully (even sorted). My custom code is  
less
than 100 lines, mostly in my SearchHandler.handleRequestBody. But I  
have
problem summing up the correct facet counts because the faceting  
counts

from each shard are not disjunctive.

Based on what is suggested by two other responses to my question, I
think it is possible that the master can pass the original large query
to each shard, and each shard will split the large query into 100  
lower
level disjunctive lucene queries, fire them against its Lucene index  
in
a parallel way and merge the results. Then each shard shall only  
return

1(instead of 100) result set to the master with disjunctive faceting
counts. It seems that the faceting problem can be solved in this  
way. I

would appreciate it if you could let me know if this approach is
feasible and correct; what solr plug-ins are needed(my guess is a  
custom

parser and query-component?)

Thanks,

Jeff



-Original Message-
From: Grant Ingersoll [mailto:gsing...@apache.org]
Sent: Thursday, September 24, 2009 10:01 AM
To: solr-dev@lucene.apache.org
Subject: [PMX:FAKE_SENDER] Re: large OR-boolean query


On Sep 23, 2009, at 4:26 PM, Luo, Jeff wrote:


Hi,

We are experimenting a parallel approach to issue a large OR-Boolean
query, e.g., keywords:(1 OR 2 OR 3 OR ... OR 102400), against several
solr shards.

The way we are trying is to break the large query into smaller ones,
e.g.,
the example above can be broken into 10 small queries: keywords:(1
OR 2
OR 3 OR ... OR 1024), keywords:(1025 OR 1026 OR 1027 OR ... OR 2048),
etc

Now each shard will get 10 requests and the master will merge the
results coming back from each shard, similar to the regular
distributed
search.



Can you tell us a little bit more about the why/what of this?  Are you
really searching numbers or are those just for example?  Do you care
about the score or do you just need to know whether the result is
there or not?


--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search





Re: [PMX:FAKE_SENDER] Re: large OR-boolean query

2009-09-25 Thread Yonik Seeley
On Fri, Sep 25, 2009 at 10:52 AM, Luo, Jeff j...@cas.org wrote:
 Based on what is suggested by two other responses to my question, I
 think it is possible that the master can pass the original large query
 to each shard, and each shard will split the large query into 100 lower
 level disjunctive lucene queries, fire them against its Lucene index in
 a parallel way and merge the results. Then each shard shall only return
 1(instead of 100) result set to the master with disjunctive faceting
 counts. It seems that the faceting problem can be solved in this way. I
 would appreciate it if you could let me know if this approach is
 feasible and correct; what solr plug-ins are needed(my guess is a custom
 parser and query-component?)

A custom query type that does this big parallel OR, and a
QParserPlugin that creates that query.  No changes to any search
components should be needed.
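The merge step inside such a custom query type might look like the following pure-Java sketch. Solr and Lucene specifics are omitted; each sub-query is stood in for by a Callable returning its matching document ids, and the results are unioned:

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelOr {
    // Run the sub-queries in parallel and return the union of their doc ids,
    // i.e. the semantics of one big OR over all chunks.
    public static Set<Integer> unionAll(List<Callable<Set<Integer>>> subQueries) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Set<Integer> union = new TreeSet<>();
            for (Future<Set<Integer>> f : pool.invokeAll(subQueries)) {
                union.addAll(f.get()); // merge each sub-result as it completes
            }
            return union;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real QParserPlugin the Callables would instead be Lucene queries executed against the local index, but the fan-out/merge shape is the same.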

-Yonik
http://www.lucidimagination.com


[jira] Created: (SOLR-1463) Materialize Clusters as Filters for use in restricting future results

2009-09-25 Thread Grant Ingersoll (JIRA)
Materialize Clusters as Filters for use in restricting future results
-

 Key: SOLR-1463
 URL: https://issues.apache.org/jira/browse/SOLR-1463
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Priority: Minor


Given a set of clusters (either based on search result or document clustering), 
seed the filter cache with the cluster results as filters so that future 
searches could be directed to search only a given cluster.  Not sure yet how 
this would fit into the existing filter capabilities.




[jira] Commented: (SOLR-1336) Add support for lucene's SmartChineseAnalyzer

2009-09-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12759611#action_12759611
 ] 

Yonik Seeley commented on SOLR-1336:


I guess it should go into contrib for now...
bq. where should i put factories?

It would be nice if we could avoid another jar, just for 2 small classes.
Perhaps we could make them lazy load?  token streams are reused now, so a small 
reflection overhead is no longer an issue.
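A minimal sketch of the lazy-load idea using plain reflection: the factory class is resolved by name only on first use, so the extra jar need not be present unless the feature is actually configured. The class name below is only for demonstration:

```java
public class LazyFactory {
    private final String className;
    private volatile Object instance;

    public LazyFactory(String className) {
        this.className = className;
    }

    // Resolve and instantiate the class on first call; cache afterwards.
    public Object getInstance() {
        Object local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    try {
                        local = Class.forName(className)
                                .getDeclaredConstructor().newInstance();
                    } catch (ReflectiveOperationException e) {
                        throw new RuntimeException("cannot load " + className, e);
                    }
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Since token streams are reused, this one-time reflective construction would not show up as per-request overhead.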


 Add support for lucene's SmartChineseAnalyzer
 -

 Key: SOLR-1336
 URL: https://issues.apache.org/jira/browse/SOLR-1336
 Project: Solr
  Issue Type: New Feature
  Components: Analysis
Reporter: Robert Muir
 Attachments: SOLR-1336.patch, SOLR-1336.patch, SOLR-1336.patch


 SmartChineseAnalyzer was contributed to lucene, it indexes simplified chinese 
 text as words.
 if the factories for the tokenizer and word token filter are added to solr it 
 can be used, although there should be a sample config or wiki entry showing 
 how to apply the built-in stopwords list.
 this is because it doesn't contain actual stopwords, but must be used to 
 prevent indexing punctuation... 
 note: we did some refactoring/cleanup on this analyzer recently, so it would 
 be much easier to do this after the next lucene update.
 it has also been moved out of -analyzers.jar due to size, and now builds in 
 its own smartcn jar file, so that would need to be added if this feature is 
 desired.




[jira] Commented: (SOLR-1336) Add support for lucene's SmartChineseAnalyzer

2009-09-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12759616#action_12759616
 ] 

Robert Muir commented on SOLR-1336:
---

{quote}
Perhaps we could make them lazy load? token streams are reused now, so a small 
reflection overhead is no longer an issue.
{quote}

If we do this, then we could avoid a contrib that is really just a jar file? 
and instead could the jar file just go in the example/solr/lib?


 Add support for lucene's SmartChineseAnalyzer
 -

 Key: SOLR-1336
 URL: https://issues.apache.org/jira/browse/SOLR-1336
 Project: Solr
  Issue Type: New Feature
  Components: Analysis
Reporter: Robert Muir
 Attachments: SOLR-1336.patch, SOLR-1336.patch, SOLR-1336.patch


 SmartChineseAnalyzer was contributed to lucene, it indexes simplified chinese 
 text as words.
 if the factories for the tokenizer and word token filter are added to solr it 
 can be used, although there should be a sample config or wiki entry showing 
 how to apply the built-in stopwords list.
 this is because it doesn't contain actual stopwords, but must be used to 
 prevent indexing punctuation... 
 note: we did some refactoring/cleanup on this analyzer recently, so it would 
 be much easier to do this after the next lucene update.
 it has also been moved out of -analyzers.jar due to size, and now builds in 
 its own smartcn jar file, so that would need to be added if this feature is 
 desired.




[jira] Commented: (SOLR-1336) Add support for lucene's SmartChineseAnalyzer

2009-09-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12759625#action_12759625
 ] 

Yonik Seeley commented on SOLR-1336:


In theory perhaps, but one problem is that example/solr/lib isn't even in 
svn... nothing lives there, but is copied there (currently).
There's been a lot of discussions on solr-dev lately about where the tika libs 
should live, etc... 
http://search.lucidimagination.com/search/document/a9520632864db021/distinct_example_for_solr_cell
And SOLR-1449 is also in the mix as a way to reference jars outside of the 
example lib.

 Add support for lucene's SmartChineseAnalyzer
 -

 Key: SOLR-1336
 URL: https://issues.apache.org/jira/browse/SOLR-1336
 Project: Solr
  Issue Type: New Feature
  Components: Analysis
Reporter: Robert Muir
 Attachments: SOLR-1336.patch, SOLR-1336.patch, SOLR-1336.patch


 SmartChineseAnalyzer was contributed to lucene, it indexes simplified chinese 
 text as words.
 if the factories for the tokenizer and word token filter are added to solr it 
 can be used, although there should be a sample config or wiki entry showing 
 how to apply the built-in stopwords list.
 this is because it doesn't contain actual stopwords, but must be used to 
 prevent indexing punctuation... 
 note: we did some refactoring/cleanup on this analyzer recently, so it would 
 be much easier to do this after the next lucene update.
 it has also been moved out of -analyzers.jar due to size, and now builds in 
 its own smartcn jar file, so that would need to be added if this feature is 
 desired.




[jira] Commented: (SOLR-1336) Add support for lucene's SmartChineseAnalyzer

2009-09-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12759630#action_12759630
 ] 

Robert Muir commented on SOLR-1336:
---

Yonik maybe it would be better to wait until these things settle out first? (I 
glanced at issues and saw -1, +1, and such)

I guess there is always the option, for release 1.4, of doing nothing and 
instructing users who want to use this analyzer to put lucene-smartcn-2.9.jar in 
their lib and use analyzer= (they will be stuck with Porter stemming and such 
for now, though)


 Add support for lucene's SmartChineseAnalyzer
 -

 Key: SOLR-1336
 URL: https://issues.apache.org/jira/browse/SOLR-1336
 Project: Solr
  Issue Type: New Feature
  Components: Analysis
Reporter: Robert Muir
 Attachments: SOLR-1336.patch, SOLR-1336.patch, SOLR-1336.patch


 SmartChineseAnalyzer was contributed to lucene, it indexes simplified chinese 
 text as words.
 if the factories for the tokenizer and word token filter are added to solr it 
 can be used, although there should be a sample config or wiki entry showing 
 how to apply the built-in stopwords list.
 this is because it doesn't contain actual stopwords, but must be used to 
 prevent indexing punctuation... 
 note: we did some refactoring/cleanup on this analyzer recently, so it would 
 be much easier to do this after the next lucene update.
 it has also been moved out of -analyzers.jar due to size, and now builds in 
 its own smartcn jar file, so that would need to be added if this feature is 
 desired.




[jira] Commented: (SOLR-1314) Upgrade Carrot2 to version 3.1.0

2009-09-25 Thread Stanislaw Osinski (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12759667#action_12759667
 ] 

Stanislaw Osinski commented on SOLR-1314:
-

Hi Grant,

bq. Now that Lucene is final, can we finalize the jar for this one? 

Sure, over the weekend we'll be making an official Carrot2 3.1.0 release. As 
part of that process I'll check if the Solr plugin is working fine and will 
post the final JAR here.

bq. Also, this final JAR will handle the license and FastVector stuff, right?

Correct. The following commit removed it from trunk and hence the 3.1.0 release:

http://fisheye3.atlassian.com/changelog/carrot2/?cs=3694

S.

 Upgrade Carrot2 to version 3.1.0
 

 Key: SOLR-1314
 URL: https://issues.apache.org/jira/browse/SOLR-1314
 Project: Solr
  Issue Type: Task
Reporter: Stanislaw Osinski
Assignee: Grant Ingersoll
 Fix For: 1.4


 As soon as Lucene 2.9 is released, Carrot2 3.1.0 will come out with bug fixes 
 in clustering algorithms and improved clustering in Chinese. The upgrade 
 should be a matter of upgrading {{carrot2-mini.jar}} and 
 {{google-collections.jar}}.




[jira] Commented: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12759671#action_12759671
 ] 

Yonik Seeley commented on SOLR-1458:


Looking into this more, I think this should be the deletion policy that keeps 
around the last optimized commit point if necessary.
Also, in checking out SolrDeletionPolicy again, it doesn't seem like the 
maxCommitsToKeep logic will work if keepOptimizedOnly is true.
I'm going to take a whack at rewriting updateCommits()

 Java Replication error: NullPointerException SEVERE: SnapPull failed on 
 2009-09-22 nightly
 --

 Key: SOLR-1458
 URL: https://issues.apache.org/jira/browse/SOLR-1458
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4
 Environment: CentOS x64
 8GB RAM
 Tomcat, running with 7G max memory; memory usage is 2GB, so it's not the 
 problem
 Host a: master
 Host b: slave
 Multiple single core Solr instances, using JNDI.
 Java replication
Reporter: Artem Russakovskii
Assignee: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-1458.patch, SOLR-1458.patch, SOLR-1458.patch, 
 SOLR-1458.patch


 After finally figuring out the new Java based replication, we have started 
 both the slave and the master and issued optimize to all master Solr 
 instances. This triggered some replication to go through just fine, but it 
 looks like some of it is failing.
 Here's what I'm getting in the slave logs, repeatedly for each shard:
 {code} 
 SEVERE: SnapPull failed 
 java.lang.NullPointerException
 at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:271)
 at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:258)
 at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at 
 java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 {code} 
 If I issue an optimize again on the master to one of the shards, it then 
 triggers a replication and replicates OK. I have a feeling that these 
 SnapPull failures appear later on but right now I don't have enough to form a 
 pattern.
 Here's replication.properties on one of the failed slave instances.
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 19:35:30 PDT 2009
 replicationFailedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 previousCycleTimeInSeconds=0
 timesFailed=113
 indexReplicatedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 indexReplicatedAt=1253759730020
 replicationFailedAt=1253759730020
 lastCycleBytesDownloaded=0
 timesIndexReplicated=113
 {code}
 and another
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 18:42:01 PDT 2009
 replicationFailedAtList=1253756490034,1253756460169
 previousCycleTimeInSeconds=1
 timesFailed=2
 indexReplicatedAtList=1253756521284,1253756490034,1253756460169
 indexReplicatedAt=1253756521284
 replicationFailedAt=1253756490034
 lastCycleBytesDownloaded=22932293
 timesIndexReplicated=3
 {code}
 Some relevant configs:
 In solrconfig.xml:
 {code}
 <!-- For docs see http://wiki.apache.org/solr/SolrReplication -->
 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="master">
     <str name="enable">${enable.master:false}</str>
     <str name="replicateAfter">optimize</str>
     <str name="backupAfter">optimize</str>
     <str name="commitReserveDuration">00:00:20</str>
   </lst>
   <lst name="slave">
     <str name="enable">${enable.slave:false}</str>
     <!-- url of master, from properties file -->
     <str name="masterUrl">${master.url}</str>
     <!-- how often to check master -->
     <str name="pollInterval">00:00:30</str>
   </lst>
 </requestHandler>
 {code}
 The slave then has this in 

[jira] Updated: (SOLR-1457) Deploy shards from HDFS into local cores

2009-09-25 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated SOLR-1457:
---

Attachment: hadoop-0.19.0-core.jar
SOLR-1475.patch

There's a DFSCoreAdminHandler that enables a new dfsdeploy action that copies a 
path out of HDFS onto the local system.  The test case shows this working.  

 Deploy shards from HDFS into local cores
 

 Key: SOLR-1457
 URL: https://issues.apache.org/jira/browse/SOLR-1457
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Jason Rutherglen
Priority: Minor
 Fix For: 1.5

 Attachments: hadoop-0.19.0-core.jar

   Original Estimate: 72h
  Remaining Estimate: 72h

 This will be an interim utility (until Katta integration
 SOLR-1395 becomes more functional) that allows deployment of
 multiple sharded indexes in HDFS onto a local Solr server. To
 make it easy, I'd run it remotely via SSH so that one doesn't
 have to manually execute it per machine.




[jira] Updated: (SOLR-1457) Deploy shards from HDFS into local cores

2009-09-25 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated SOLR-1457:
---

Attachment: (was: SOLR-1475.patch)

 Deploy shards from HDFS into local cores
 

 Key: SOLR-1457
 URL: https://issues.apache.org/jira/browse/SOLR-1457
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Jason Rutherglen
Priority: Minor
 Fix For: 1.5

 Attachments: hadoop-0.19.0-core.jar

   Original Estimate: 72h
  Remaining Estimate: 72h

 This will be an interim utility (until Katta integration
 SOLR-1395 becomes more functional) that allows deployment of
 multiple sharded indexes in HDFS onto a local Solr server. To
 make it easy, I'd run it remotely via SSH so that one doesn't
 have to manually execute it per machine.




[jira] Updated: (SOLR-1167) Support module xml config files using XInclude

2009-09-25 Thread Bryan Talbot (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Talbot updated SOLR-1167:
---

Fix Version/s: 1.4

The patch is for the trunk, currently 1.4 
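For anyone trying the feature, a minimal solrconfig.xml using XInclude might look like this; the included file name is hypothetical, and the xi namespace is the standard W3C XInclude one:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- pull shared request-handler definitions out of the monolithic file -->
  <xi:include href="common-handlers.xml"/>
</config>
```

The XML parser must have XInclude processing enabled for the include to be resolved.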

 Support module xml config files using XInclude
 --

 Key: SOLR-1167
 URL: https://issues.apache.org/jira/browse/SOLR-1167
 Project: Solr
  Issue Type: New Feature
Affects Versions: 1.4
Reporter: Bryan Talbot
Priority: Minor
 Fix For: 1.4

 Attachments: SOLR-1167.patch, SOLR-1167.patch, SOLR-1167.patch


 Current configuration files (schema and solrconfig) are monolithic which can 
 make maintenance and reuse more difficult that it needs to be.  The XML 
 standards include a feature to include content from external files.  This is 
 described at http://www.w3.org/TR/xinclude/
 This feature is to add support for XInclude features for XML configuration 
 files.




[jira] Updated: (SOLR-1457) Deploy shards from HDFS into local cores

2009-09-25 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated SOLR-1457:
---

Description: This issue extends CoreAdminHandler to allow installation of 
new cores from HDFS.  (was: This will be an interim utility (until Katta 
integration
SOLR-1395 becomes more functional) that allows deployment of
multiple sharded indexes in HDFS onto a local Solr server. To
make it easy, I'd run it remotely via SSH so that one doesn't
have to manually execute it per machine.)

 Deploy shards from HDFS into local cores
 

 Key: SOLR-1457
 URL: https://issues.apache.org/jira/browse/SOLR-1457
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Jason Rutherglen
Priority: Minor
 Fix For: 1.5

 Attachments: hadoop-0.19.0-core.jar, SOLR-1457.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 This issue extends CoreAdminHandler to allow installation of new cores from 
 HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1458:
---

Attachment: SolrDeletionPolicy.patch

Here's a partial patch - only to SolrDeletionPolicy - that rewrites that logic 
so that all of the options hopefully work correctly.  Now if 
keepOptimizedOnly==true and maxCommitsToKeep==1, then there should always be 
one optimized index commit point.

Should we just document that keepOptimizedOnly needs to be true if you're doing 
replication on optimized commit points only?

Alternatively, the replication handler could set parameters on the deletion 
policy - the problem being that the current parameters don't lend themselves to 
being manipulated.  For example if the policy has keepOptimizedOnly=false and 
maxCommitsToKeep==5 then if the replication handler changed 
keepOptimizedOnly to true, we would end up keeping 5 optimized commit points!  
It would be more flexible to be able to specify a separate count for optimized 
commit points.
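
The separate-count idea in the last paragraph can be sketched in isolation. CommitPoint below is a stand-in for Lucene's IndexCommit, and the class and method names are invented for the demo; this is not the patch's actual code:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of "a separate count for optimized commit points": keep the newest
 * maxCommitsToKeep commits, plus enough of the newest optimized commits to
 * retain maxOptimizedCommitsToKeep of them.
 */
public class SeparateCountPolicy {

    static final class CommitPoint {
        final long version;
        final boolean optimized;
        CommitPoint(long version, boolean optimized) {
            this.version = version;
            this.optimized = optimized;
        }
    }

    /** commits are ordered oldest-to-newest; returns the survivors of deletion. */
    static List<CommitPoint> survivors(List<CommitPoint> commits,
                                       int maxCommitsToKeep,
                                       int maxOptimizedCommitsToKeep) {
        List<CommitPoint> keep = new ArrayList<>();
        int kept = 0, keptOptimized = 0;
        for (int i = commits.size() - 1; i >= 0; i--) {   // walk newest first
            CommitPoint c = commits.get(i);
            if (kept < maxCommitsToKeep) {
                keep.add(0, c);
                kept++;
                if (c.optimized) keptOptimized++;
            } else if (c.optimized && keptOptimized < maxOptimizedCommitsToKeep) {
                keep.add(0, c);   // retained solely because it is optimized
                keptOptimized++;
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        List<CommitPoint> commits = new ArrayList<>();
        for (long v = 1; v <= 6; v++) {
            commits.add(new CommitPoint(v, v == 2));   // only version 2 is optimized
        }
        for (CommitPoint c : survivors(commits, 1, 1)) {
            System.out.println(c.version + (c.optimized ? " (optimized)" : ""));
        }
        // prints "2 (optimized)" then "6" - the optimized commit survives
        // even though it is far from the newest
    }
}
```

With maxCommitsToKeep=1 and maxOptimizedCommitsToKeep=1, raising keepOptimizedOnly-style behavior no longer forces keeping every recent optimized commit.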

 Java Replication error: NullPointerException SEVERE: SnapPull failed on 
 2009-09-22 nightly
 --

 Key: SOLR-1458
 URL: https://issues.apache.org/jira/browse/SOLR-1458
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4
 Environment: CentOS x64
 8GB RAM
 Tomcat, running with 7G max memory; memory usage is 2GB, so it's not the 
 problem
 Host a: master
 Host b: slave
 Multiple single core Solr instances, using JNDI.
 Java replication
Reporter: Artem Russakovskii
Assignee: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-1458.patch, SOLR-1458.patch, SOLR-1458.patch, 
 SOLR-1458.patch, SolrDeletionPolicy.patch


 After finally figuring out the new Java based replication, we have started 
 both the slave and the master and issued optimize to all master Solr 
 instances. This triggered some replication to go through just fine, but it 
 looks like some of it is failing.
 Here's what I'm getting in the slave logs, repeatedly for each shard:
 {code} 
 SEVERE: SnapPull failed 
 java.lang.NullPointerException
 at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:271)
 at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:258)
 at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at 
 java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 {code} 
 If I issue an optimize again on the master to one of the shards, it then 
 triggers a replication and replicates OK. I have a feeling that these 
 SnapPull failures appear later on but right now I don't have enough to form a 
 pattern.
 Here's replication.properties on one of the failed slave instances.
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 19:35:30 PDT 2009
 replicationFailedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 previousCycleTimeInSeconds=0
 timesFailed=113
 indexReplicatedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 indexReplicatedAt=1253759730020
 replicationFailedAt=1253759730020
 lastCycleBytesDownloaded=0
 timesIndexReplicated=113
 {code}
 and another
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 18:42:01 PDT 2009
 replicationFailedAtList=1253756490034,1253756460169
 previousCycleTimeInSeconds=1
 timesFailed=2
 indexReplicatedAtList=1253756521284,1253756490034,1253756460169
 indexReplicatedAt=1253756521284
 replicationFailedAt=1253756490034
 lastCycleBytesDownloaded=22932293
 timesIndexReplicated=3
 {code}
 Some relevant configs:
 In solrconfig.xml:
 {code}
 <!-- For docs see http://wiki.apache.org/solr/SolrReplication -->
   <requestHandler name="/replication" class="solr.ReplicationHandler">
 <lst 

Re: Facet query throws NullPointerException when the facetqueries response is null in QueryResponse object

2009-09-25 Thread jrduncans

As I understand it, the problem occurred when there was an error parsing the
facet query, which then resulted in a NullPointerException.  I'll be
updating our code to the latest trunk soon (and hopefully to 1.4 final soon
after) to verify that the problem is fixed.


Yonik Seeley-2 wrote:
 
 I committed this change, just for defensive purposes...
 but why is facet_queries coming back null for you?  I thought it was
 always added when facet=true.
 
 -Yonik
 http://www.lucidimagination.com
 
 

-- 
View this message in context: 
http://www.nabble.com/Facet-query-throws-NullPointerException-when-the-facetqueries-response-is-null-in-QueryResponse-object-tp25492240p25617752.html
Sent from the Solr - Dev mailing list archive at Nabble.com.



[jira] Updated: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1458:
---

Attachment: SolrDeletionPolicy.patch

Updated patch that implements a maxOptimizedCommitsToKeep parameter - this 
would allow the replication handler to raise it to 1 if necessary.
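
If the parameter were exposed in solrconfig.xml alongside the existing deletion-policy options, it might look like the following; the element names and placement are assumptions based on how SolrDeletionPolicy is configured, not taken from the patch:

{code}
<deletionPolicy class="solr.SolrDeletionPolicy">
  <!-- keep only the newest regular commit point -->
  <str name="maxCommitsToKeep">1</str>
  <!-- but always retain one optimized commit for replication -->
  <str name="maxOptimizedCommitsToKeep">1</str>
</deletionPolicy>
{code}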

 Java Replication error: NullPointerException SEVERE: SnapPull failed on 
 2009-09-22 nightly
 --

 Key: SOLR-1458
 URL: https://issues.apache.org/jira/browse/SOLR-1458
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4
 Environment: CentOS x64
 8GB RAM
 Tomcat, running with 7G max memory; memory usage is 2GB, so it's not the 
 problem
 Host a: master
 Host b: slave
 Multiple single core Solr instances, using JNDI.
 Java replication
Reporter: Artem Russakovskii
Assignee: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-1458.patch, SOLR-1458.patch, SOLR-1458.patch, 
 SOLR-1458.patch, SolrDeletionPolicy.patch, SolrDeletionPolicy.patch


 After finally figuring out the new Java based replication, we have started 
 both the slave and the master and issued optimize to all master Solr 
 instances. This triggered some replication to go through just fine, but it 
 looks like some of it is failing.
 Here's what I'm getting in the slave logs, repeatedly for each shard:
 {code} 
 SEVERE: SnapPull failed 
 java.lang.NullPointerException
 at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:271)
 at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:258)
 at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at 
 java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 {code} 
 If I issue an optimize again on the master to one of the shards, it then 
 triggers a replication and replicates OK. I have a feeling that these 
 SnapPull failures appear later on but right now I don't have enough to form a 
 pattern.
 Here's replication.properties on one of the failed slave instances.
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 19:35:30 PDT 2009
 replicationFailedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 previousCycleTimeInSeconds=0
 timesFailed=113
 indexReplicatedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 indexReplicatedAt=1253759730020
 replicationFailedAt=1253759730020
 lastCycleBytesDownloaded=0
 timesIndexReplicated=113
 {code}
 and another
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 18:42:01 PDT 2009
 replicationFailedAtList=1253756490034,1253756460169
 previousCycleTimeInSeconds=1
 timesFailed=2
 indexReplicatedAtList=1253756521284,1253756490034,1253756460169
 indexReplicatedAt=1253756521284
 replicationFailedAt=1253756490034
 lastCycleBytesDownloaded=22932293
 timesIndexReplicated=3
 {code}
 Some relevant configs:
 In solrconfig.xml:
 {code}
 <!-- For docs see http://wiki.apache.org/solr/SolrReplication -->
   <requestHandler name="/replication" class="solr.ReplicationHandler">
 <lst name="master">
 <str name="enable">${enable.master:false}</str>
 <str name="replicateAfter">optimize</str>
 <str name="backupAfter">optimize</str>
 <str name="commitReserveDuration">00:00:20</str>
 </lst>
 <lst name="slave">
 <str name="enable">${enable.slave:false}</str>
 <!-- url of master, from properties file -->
 <str name="masterUrl">${master.url}</str>
 <!-- how often to check master -->
 <str name="pollInterval">00:00:30</str>
 </lst>
   </requestHandler>
 {code}
 The slave then has this in solrcore.properties:
 {code}
 enable.slave=true
 master.url=URLOFMASTER/replication
 {code}
 and the master has
 {code}
 enable.master=true
 {code}
 I'd be glad 

Solr Slave Sync Pulling Snapshot Issue

2009-09-25 Thread Zhao, Jing Jing
Hi, Solr Developers,

We are experiencing the following issue. When Solr Master makes more than one 
commit within the same second of the clock, the Solr Slave snap puller does not 
pull the latest snapshot from Solr Master.

Can anyone advise whether this is a known issue, or should I submit it as a bug?

Thanks!
Jing Jing Zhao




Fwd: 8 for 1.4

2009-09-25 Thread Grant Ingersoll

Argh, this was meant for solr-dev.

Begin forwarded message:


From: Grant Ingersoll gsing...@apache.org
Date: September 25, 2009 1:34:32 PM EDT
To: solr-u...@lucene.apache.org
Subject: 8 for 1.4
Reply-To: solr-u...@lucene.apache.org

Y'all,

We're down to 8 open issues:  
https://issues.apache.org/jira/secure/BrowseVersion.jspa?id=12310230&versionId=12313351&showOpenIssuesOnly=true

2 are packaging related, one is dependent on the official 2.9  
release (so should be taken care of today or tomorrow I suspect) and  
then we have a few others.


The only somewhat major ones are S-1458, S-1294 (more on this in  
a mo') and S-1449.


On S-1294, the SolrJS patch, I yet again have concerns about even  
including this, given the lack of activity (from Matthias, the  
original author and others) and the fact that some in the Drupal  
community have already forked this to fix the various bugs in it  
instead of just submitting patches.  While I really like the idea of  
this library (jQuery is awesome), I have yet to see interest in the  
community to maintain it (unless you count someone forking it and  
fixing the bugs in the fork as maintenance) and I'll be upfront in  
admitting I have neither the time nor the patience to debug  
Javascript across the gazillions of browsers out there (I don't even  
have IE on my machine unless you count firing up a VM w/ XP on it)  
in the wild.  Given what I know of most of the other committers  
here, I suspect that is true for others too.  At a minimum, I think  
S-1294 should be pushed to 1.5.  Next up, I think we consider  
pulling SolrJS from the release, but keeping it in trunk and  
officially releasing it with either 1.5 or 1.4.1, assuming its  
gotten some love in the meantime.  If by then it has no love, I vote  
we remove it and let the fork maintain it and point people there.


-Grant





[jira] Created: (SOLR-1464) CommonsHttpSolrServer does not conform to bean conventions

2009-09-25 Thread Sean Fitzgerald (JIRA)
CommonsHttpSolrServer does not conform to bean conventions
--

 Key: SOLR-1464
 URL: https://issues.apache.org/jira/browse/SOLR-1464
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.3
Reporter: Sean Fitzgerald
 Attachments: CommonsHttpSolrServer.java-BEAN.patch

Several class variables (baseURL, allowCompression, maxRetries, etc) have 
neither getters nor setters. By creating getters and setters for these 
properties, we can allow other developers to extend CommonsHttpSolrServer with 
additional functionality. It is also then necessary to use these methods 
internally, as opposed to referencing the class variables directly.



For example, by extending a method like 
public String getBaseURL()
One could attach a host monitoring or home-brewed DNS resolution service to 
intercept requests, thus replicating the functionality of LBHttpSolrServer with 
very little of the code.

Attached is a basic patch (generated using eclipse Source tools), as a minimal 
set of changes. I have not changed the general coding style of the file, though 
that would be preferable. I am open to suggestions on whether these methods 
should be public (as in the attached patch), or protected.
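
To illustrate the extension point the ticket describes: once every internal reference goes through getBaseURL(), a subclass can swap the URL per request. Base below is a stand-in for CommonsHttpSolrServer, and the round-robin subclass is an invented example of what "very little of the code" could look like, not LBHttpSolrServer's actual implementation:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for CommonsHttpSolrServer with the proposed accessor. */
class Base {
    private final String baseURL;
    Base(String baseURL) { this.baseURL = baseURL; }
    // the server would call this internally instead of reading the field
    public String getBaseURL() { return baseURL; }
}

/** Hypothetical subclass: rotates requests across several hosts. */
class RoundRobinServer extends Base {
    private final List<String> hosts;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinServer(List<String> hosts) {
        super(hosts.get(0));
        this.hosts = hosts;
    }

    @Override
    public String getBaseURL() {
        // every internal request now sees a different base URL in turn
        return hosts.get(Math.floorMod(next.getAndIncrement(), hosts.size()));
    }

    public static void main(String[] args) {
        RoundRobinServer s = new RoundRobinServer(
                Arrays.asList("http://a:8983/solr", "http://b:8983/solr"));
        System.out.println(s.getBaseURL());  // http://a:8983/solr
        System.out.println(s.getBaseURL());  // http://b:8983/solr
        System.out.println(s.getBaseURL());  // http://a:8983/solr
    }
}
```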

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1464) CommonsHttpSolrServer does not conform to bean conventions

2009-09-25 Thread Sean Fitzgerald (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Fitzgerald updated SOLR-1464:
--

Attachment: CommonsHttpSolrServer.java-BEAN.patch

Generated bean methods for CommonsHttpSolrServer.java

 CommonsHttpSolrServer does not conform to bean conventions
 --

 Key: SOLR-1464
 URL: https://issues.apache.org/jira/browse/SOLR-1464
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.3
Reporter: Sean Fitzgerald
 Attachments: CommonsHttpSolrServer.java-BEAN.patch


 Several class variables (baseURL, allowCompression, maxRetries, etc) have 
 neither getters nor setters. By creating getters and setters for these 
 properties, we can allow other developers to extend CommonsHttpSolrServer 
 with additional functionality. It is also then necessary to use these methods 
 internally, as opposed to referencing the class variables directly.
 For example, by extending a method like 
 public String getBaseURL()
 One could attach a host monitoring or home-brewed DNS resolution service to 
 intercept, thus replicating the functionality of LBHttpSolrServer with very 
 little of the code.
 Attached is a basic patch (generated using eclipse Source tools), as a 
 minimal set of changes. I have not changed the general coding style of the 
 file, though that would be preferable. I am open to suggestions on whether 
 these methods should be public (as in the attached patch), or protected.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[Fwd: Re: [Aperture-devel] pdfbox 0.8.0 released]

2009-09-25 Thread Mark Miller
Not sure that anyone has time, but if someone does, it might be worth
investigating an upgrade for Solr Cell in 1.4 - as long as Tika still works
with it (I don't see anything at the Tika site about supported versions, and I
don't know if PDFBox 0.8 is API compatible - though I assume so).

Figured it was worth flagging anyway.

-- 
- Mark

http://www.lucidimagination.com



---BeginMessage---
Arjohn Kampman wrote:
 Hi all,
 
 Pdfbox 0.8.0 has (finally) been released this week:
 http://www.mail-archive.com/pdfbox-...@incubator.apache.org/msg01734.html
 
 Considering the huge amount of bug fixes and improvements that it
 contains, it would be nice to update the dependency in aperture to this
 new release. I have created a ticket in the issue tracker for this:
 https://sourceforge.net/tracker/?func=detail&aid=2866718&group_id=150969&atid=779503
 
 Cheers,
 
 Arjohn

done

Antoni


___
Aperture-devel mailing list
aperture-de...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/aperture-devel
---End Message---


[jira] Commented: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Artem Russakovskii (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759784#action_12759784
 ] 

Artem Russakovskii commented on SOLR-1458:
--

I haven't changed any configs yet, and this probably doesn't come as a shock to 
you guys, but the master just ran out of space. Upon inspection, I found 30+ 
snapshot dirs sitting around in /data.

 Java Replication error: NullPointerException SEVERE: SnapPull failed on 
 2009-09-22 nightly
 --

 Key: SOLR-1458
 URL: https://issues.apache.org/jira/browse/SOLR-1458
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4
 Environment: CentOS x64
 8GB RAM
 Tomcat, running with 7G max memory; memory usage is 2GB, so it's not the 
 problem
 Host a: master
 Host b: slave
 Multiple single core Solr instances, using JNDI.
 Java replication
Reporter: Artem Russakovskii
Assignee: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-1458.patch, SOLR-1458.patch, SOLR-1458.patch, 
 SOLR-1458.patch, SolrDeletionPolicy.patch, SolrDeletionPolicy.patch


 After finally figuring out the new Java based replication, we have started 
 both the slave and the master and issued optimize to all master Solr 
 instances. This triggered some replication to go through just fine, but it 
 looks like some of it is failing.
 Here's what I'm getting in the slave logs, repeatedly for each shard:
 {code} 
 SEVERE: SnapPull failed 
 java.lang.NullPointerException
 at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:271)
 at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:258)
 at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at 
 java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 {code} 
 If I issue an optimize again on the master to one of the shards, it then 
 triggers a replication and replicates OK. I have a feeling that these 
 SnapPull failures appear later on but right now I don't have enough to form a 
 pattern.
 Here's replication.properties on one of the failed slave instances.
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 19:35:30 PDT 2009
 replicationFailedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 previousCycleTimeInSeconds=0
 timesFailed=113
 indexReplicatedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 indexReplicatedAt=1253759730020
 replicationFailedAt=1253759730020
 lastCycleBytesDownloaded=0
 timesIndexReplicated=113
 {code}
 and another
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 18:42:01 PDT 2009
 replicationFailedAtList=1253756490034,1253756460169
 previousCycleTimeInSeconds=1
 timesFailed=2
 indexReplicatedAtList=1253756521284,1253756490034,1253756460169
 indexReplicatedAt=1253756521284
 replicationFailedAt=1253756490034
 lastCycleBytesDownloaded=22932293
 timesIndexReplicated=3
 {code}
 Some relevant configs:
 In solrconfig.xml:
 {code}
 <!-- For docs see http://wiki.apache.org/solr/SolrReplication -->
   <requestHandler name="/replication" class="solr.ReplicationHandler">
 <lst name="master">
 <str name="enable">${enable.master:false}</str>
 <str name="replicateAfter">optimize</str>
 <str name="backupAfter">optimize</str>
 <str name="commitReserveDuration">00:00:20</str>
 </lst>
 <lst name="slave">
 <str name="enable">${enable.slave:false}</str>
 <!-- url of master, from properties file -->
 <str name="masterUrl">${master.url}</str>
 <!-- how often to check master -->
 <str name="pollInterval">00:00:30</str>
 </lst>
   </requestHandler>
 {code}
 The slave then has this in solrcore.properties:
 {code}
 enable.slave=true
 

[jira] Issue Comment Edited: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Artem Russakovskii (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759784#action_12759784
 ] 

Artem Russakovskii edited comment on SOLR-1458 at 9/25/09 3:32 PM:
---

I haven't changed any configs yet, and this probably doesn't come as a shock to 
you guys, but the master just ran out of space. Upon inspection, I found 30+ 
snapshot dirs sitting around in /data.

Paul, adding your deletionPolicy fix didn't delete the files, even after 
optimize. Is that expected?

{code}
drwxrwxr-x  2 bla bla  4096 Sep 23 18:42 snapshot.20090923064214
drwxrwxr-x  2 bla bla  4096 Sep 23 19:15 snapshot.20090923071530
drwxrwxr-x  2 bla bla  4096 Sep 23 19:45 snapshot.20090923074535
drwxrwxr-x  2 bla bla  4096 Sep 23 20:15 snapshot.20090923081531
drwxrwxr-x  2 bla bla  4096 Sep 23 21:15 snapshot.20090923091531
drwxrwxr-x  2 bla bla  4096 Sep 23 22:15 snapshot.20090923101532
drwxrwxr-x  2 bla bla  4096 Sep 23 23:15 snapshot.20090923111533
drwxrwxr-x  2 bla bla  4096 Sep 24 01:15 snapshot.20090924011501
drwxrwxr-x  2 bla bla  4096 Sep 24 13:15 snapshot.20090924011535
drwxrwxr-x  2 bla bla  4096 Sep 24 02:15 snapshot.20090924021501
drwxrwxr-x  2 bla bla  4096 Sep 24 14:15 snapshot.20090924021534
drwxrwxr-x  2 bla bla  4096 Sep 24 15:15 snapshot.20090924031501
drwxrwxr-x  2 bla bla  4096 Sep 24 03:15 snapshot.20090924031502
drwxrwxr-x  2 bla bla  4096 Sep 24 04:15 snapshot.20090924041501
drwxrwxr-x  2 bla bla  4096 Sep 24 16:15 snapshot.20090924041536
drwxrwxr-x  2 bla bla  4096 Sep 24 05:15 snapshot.20090924051501
drwxrwxr-x  2 bla bla  4096 Sep 24 17:15 snapshot.20090924051537
drwxrwxr-x  2 bla bla  4096 Sep 24 06:15 snapshot.20090924061501
drwxrwxr-x  2 bla bla  4096 Sep 24 18:15 snapshot.20090924061534
drwxrwxr-x  2 bla bla  4096 Sep 24 07:15 snapshot.20090924071501
drwxrwxr-x  2 bla bla  4096 Sep 24 19:15 snapshot.20090924071533
drwxrwxr-x  2 bla bla  4096 Sep 24 08:15 snapshot.20090924081534
drwxrwxr-x  2 bla bla  4096 Sep 24 20:15 snapshot.20090924081535
drwxrwxr-x  2 bla bla  4096 Sep 24 09:15 snapshot.20090924091501
drwxrwxr-x  2 bla bla  4096 Sep 24 21:15 snapshot.20090924091532
drwxrwxr-x  2 bla bla  4096 Sep 24 10:15 snapshot.20090924101501
drwxrwxr-x  2 bla bla  4096 Sep 24 22:15 snapshot.20090924101533
drwxrwxr-x  2 bla bla  4096 Sep 24 11:15 snapshot.20090924111501
drwxrwxr-x  2 bla bla  4096 Sep 24 23:15 snapshot.20090924111532
drwxrwxr-x  2 bla bla  4096 Sep 24 12:15 snapshot.20090924121532
drwxrwxr-x  2 bla bla  4096 Sep 24 00:15 snapshot.20090924121533
drwxrwxr-x  2 bla bla  4096 Sep 25 01:15 snapshot.20090925011533
drwxrwxr-x  2 bla bla  4096 Sep 25 13:15 snapshot.20090925011540
drwxrwxr-x  2 bla bla  4096 Sep 25 02:15 snapshot.20090925021534
drwxrwxr-x  2 bla bla  4096 Sep 25 14:15 snapshot.20090925021540
drwxrwxr-x  2 bla bla  4096 Sep 25 03:15 snapshot.20090925031535
drwxrwxr-x  2 bla bla  4096 Sep 25 15:15 snapshot.20090925031540
drwxrwxr-x  2 bla bla  4096 Sep 25 15:29 snapshot.20090925032931
drwxrwxr-x  2 bla bla  4096 Sep 25 04:15 snapshot.20090925041535
drwxrwxr-x  2 bla bla  4096 Sep 25 05:15 snapshot.20090925051539
drwxrwxr-x  2 bla bla  4096 Sep 25 06:15 snapshot.20090925061538
drwxrwxr-x  2 bla bla  4096 Sep 25 07:15 snapshot.20090925071539
drwxrwxr-x  2 bla bla  4096 Sep 25 08:15 snapshot.20090925081539
drwxrwxr-x  2 bla bla  4096 Sep 25 09:15 snapshot.20090925091538
drwxrwxr-x  2 bla bla  4096 Sep 25 09:52 snapshot.20090925095213
drwxrwxr-x  2 bla bla  4096 Sep 25 10:15 snapshot.20090925101540
drwxrwxr-x  2 bla bla  4096 Sep 25 11:15 snapshot.20090925111538
drwxrwxr-x  2 bla bla  4096 Sep 25 00:15 snapshot.20090925121534
drwxrwxr-x  2 bla bla  4096 Sep 25 12:15 snapshot.20090925121538
{code}

  was (Author: archon810):
I haven't changed any configs yet, and this probably doesn't come as a 
shock to you guys, but the master just ran out of space. Upon inspection, I 
found 30+ snapshot dirs sitting around in /data.
  
 Java Replication error: NullPointerException SEVERE: SnapPull failed on 
 2009-09-22 nightly
 --

 Key: SOLR-1458
 URL: https://issues.apache.org/jira/browse/SOLR-1458
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4
 Environment: CentOS x64
 8GB RAM
 Tomcat, running with 7G max memory; memory usage is 2GB, so it's not the 
 problem
 Host a: master
 Host b: slave
 Multiple single core Solr instances, using JNDI.
 Java replication
Reporter: Artem Russakovskii
Assignee: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-1458.patch, SOLR-1458.patch, SOLR-1458.patch, 
 SOLR-1458.patch, SolrDeletionPolicy.patch, SolrDeletionPolicy.patch


 After finally figuring out the new Java based replication, we have 

[jira] Commented: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759825#action_12759825
 ] 

Yonik Seeley commented on SOLR-1458:


Hmmm... just happened onto this bit of odd code:

{code}
  void refreshCommitpoint() {
IndexCommit commitPoint = core.getDeletionPolicy().getLatestCommit();
if(replicateOnCommit && !commitPoint.isOptimized()){
  indexCommitPoint = commitPoint;
}
if(replicateOnOptimize && commitPoint.isOptimized()){
  indexCommitPoint = commitPoint;
}
  }
{code}

Looks like a bug... refreshCommitPoint always updates indexCommitPoint 
regardless of commitPoint.
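
To make the conditional easy to reason about, here is the same logic pulled out as a pure predicate (a sketch with invented names, not Solr code); it makes the behavior checkable for every flag combination, including an optimized commit arriving when only replicateOnCommit is set:

```java
/**
 * The quoted conditional, isolated as a pure function so its behavior
 * can be inspected for every flag combination.
 */
public class CommitPointPredicate {
    // mirrors the code quoted above
    static boolean quotedShouldUpdate(boolean replicateOnCommit,
                                      boolean replicateOnOptimize,
                                      boolean optimized) {
        return (replicateOnCommit && !optimized)
            || (replicateOnOptimize && optimized);
    }

    public static void main(String[] args) {
        // replicateOnCommit=true, replicateOnOptimize=false, commit is optimized:
        System.out.println(quotedShouldUpdate(true, false, true));  // prints false
    }
}
```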


 Java Replication error: NullPointerException SEVERE: SnapPull failed on 
 2009-09-22 nightly
 --

 Key: SOLR-1458
 URL: https://issues.apache.org/jira/browse/SOLR-1458
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 1.4
 Environment: CentOS x64
 8GB RAM
 Tomcat, running with 7G max memory; memory usage is 2GB, so it's not the 
 problem
 Host a: master
 Host b: slave
 Multiple single core Solr instances, using JNDI.
 Java replication
Reporter: Artem Russakovskii
Assignee: Noble Paul
 Fix For: 1.4

 Attachments: SOLR-1458.patch, SOLR-1458.patch, SOLR-1458.patch, 
 SOLR-1458.patch, SolrDeletionPolicy.patch, SolrDeletionPolicy.patch


 After finally figuring out the new Java based replication, we have started 
 both the slave and the master and issued optimize to all master Solr 
 instances. This triggered some replication to go through just fine, but it 
 looks like some of it is failing.
 Here's what I'm getting in the slave logs, repeatedly for each shard:
 {code} 
 SEVERE: SnapPull failed 
 java.lang.NullPointerException
 at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:271)
 at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:258)
 at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at 
 java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 {code} 
 If I issue an optimize again on the master to one of the shards, it then 
 triggers a replication and replicates OK. I have a feeling that these 
 SnapPull failures appear later on but right now I don't have enough to form a 
 pattern.
 Here's replication.properties on one of the failed slave instances.
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 19:35:30 PDT 2009
 replicationFailedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 previousCycleTimeInSeconds=0
 timesFailed=113
 indexReplicatedAtList=1253759730020,1253759700018,1253759670019,1253759640018,1253759610018,1253759580022,1253759550019,1253759520016,1253759490026,1253759460016
 indexReplicatedAt=1253759730020
 replicationFailedAt=1253759730020
 lastCycleBytesDownloaded=0
 timesIndexReplicated=113
 {code}
 and another
 {code}
 cat data/replication.properties 
 #Replication details
 #Wed Sep 23 18:42:01 PDT 2009
 replicationFailedAtList=1253756490034,1253756460169
 previousCycleTimeInSeconds=1
 timesFailed=2
 indexReplicatedAtList=1253756521284,1253756490034,1253756460169
 indexReplicatedAt=1253756521284
 replicationFailedAt=1253756490034
 lastCycleBytesDownloaded=22932293
 timesIndexReplicated=3
 {code}
 Some relevant configs:
 In solrconfig.xml:
 {code}
 <!-- For docs see http://wiki.apache.org/solr/SolrReplication -->
   <requestHandler name="/replication" class="solr.ReplicationHandler">
 <lst name="master">
 <str name="enable">${enable.master:false}</str>
 <str name="replicateAfter">optimize</str>
 <str name="backupAfter">optimize</str>
 <str name="commitReserveDuration">00:00:20</str>
 </lst>
 <lst name="slave">
 <str name="enable">${enable.slave:false}</str>
 <!-- url of master, from properties file -->
 <str name="masterUrl">${master.url}</str>
 <!-- how often to check master -->
 <str name="pollInterval">00:00:30</str>
 </lst>
   </requestHandler>
 {code}
 The slave then has this in solrcore.properties:
 {code}
 enable.slave=true
 master.url=URLOFMASTER/replication
 {code}
 and the master has
 {code}
 enable.master=true
 {code}

[jira] Issue Comment Edited: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Yonik Seeley (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759825#action_12759825 ]

Yonik Seeley edited comment on SOLR-1458 at 9/25/09 5:16 PM:
-

Hmmm... just happened onto this bit of odd code:

{code}
  void refreshCommitpoint() {
    IndexCommit commitPoint = core.getDeletionPolicy().getLatestCommit();
    if (replicateOnCommit && !commitPoint.isOptimized()) {
      indexCommitPoint = commitPoint;
    }
    if (replicateOnOptimize && commitPoint.isOptimized()) {
      indexCommitPoint = commitPoint;
    }
  }
{code}

edit: Looks like a bug... indexCommitPoint isn't updated for an optimized commit when only replicateOnCommit is set.
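To make the edge case concrete, here is a minimal, self-contained sketch of the decision logic. The method and parameter names below are illustrative stand-ins, not Solr's actual code; the "fixed" variant assumes the intent is that replicateOnCommit should also cover optimized commits:

```java
public class RefreshCommitPointDemo {
    // Mirrors the predicate in the quoted snippet: an optimized commit is
    // skipped when only replicateOnCommit is configured.
    static boolean shouldUpdateBuggy(boolean replicateOnCommit,
                                     boolean replicateOnOptimize,
                                     boolean commitIsOptimized) {
        if (replicateOnCommit && !commitIsOptimized) return true;
        if (replicateOnOptimize && commitIsOptimized) return true;
        return false;
    }

    // One possible fix: replicateOnCommit covers every commit,
    // optimized or not.
    static boolean shouldUpdateFixed(boolean replicateOnCommit,
                                     boolean replicateOnOptimize,
                                     boolean commitIsOptimized) {
        if (replicateOnCommit) return true;
        if (replicateOnOptimize && commitIsOptimized) return true;
        return false;
    }

    public static void main(String[] args) {
        // replicateOnCommit only, and the latest commit happens to be optimized:
        System.out.println(shouldUpdateBuggy(true, false, true)); // prints false
        System.out.println(shouldUpdateFixed(true, false, true)); // prints true
    }
}
```

Under a replicateOnCommit-only configuration, the buggy predicate skips an optimized commit, which matches the failure mode described in the edited comment.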



  

[jira] Updated: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1458:
---

Attachment: SOLR-1458.patch

Here's a draft of a full patch, but I'm getting some relatively non-reproducible test failures in a bunch of places - not sure what's up yet.
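For context, the reserve/release approach added to IndexDeletionPolicyWrapper earlier in this thread amounts to reference-counting commit points so the deletion policy cannot remove an index commit while replication is serving it. A minimal, self-contained sketch of that bookkeeping (class and method names are illustrative, not the actual Solr implementation):

```java
import java.util.HashMap;
import java.util.Map;

public class CommitReserveDemo {
    // commit version -> reserve count; a reserved commit must not be deleted
    private final Map<Long, Integer> reserves = new HashMap<>();

    public synchronized void reserveCommitPoint(long version) {
        reserves.merge(version, 1, Integer::sum);
    }

    public synchronized void releaseCommitPoint(long version) {
        // decrement; drop the entry entirely once the count reaches zero
        reserves.computeIfPresent(version, (v, n) -> n > 1 ? n - 1 : null);
    }

    // a deletion policy would consult this before removing a commit point
    public synchronized boolean isDeletable(long version) {
        return !reserves.containsKey(version);
    }

    public static void main(String[] args) {
        CommitReserveDemo policy = new CommitReserveDemo();
        policy.reserveCommitPoint(42L);
        System.out.println(policy.isDeletable(42L)); // prints false: pull in progress
        policy.releaseCommitPoint(42L);
        System.out.println(policy.isDeletable(42L)); // prints true: safe to delete
    }
}
```

The reservation closes the race where a slave requests files from a commit point that the master's deletion policy has already removed, which would otherwise surface as the NullPointerException in SnapPuller reported above.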


[jira] Commented: (SOLR-1458) Java Replication error: NullPointerException SEVERE: SnapPull failed on 2009-09-22 nightly

2009-09-25 Thread Lance Norskog (JIRA)

[ https://issues.apache.org/jira/browse/SOLR-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759851#action_12759851 ]

Lance Norskog commented on SOLR-1458:
-

I reported [SOLR-1383|https://issues.apache.org/jira/browse/SOLR-1383] a few
weeks ago. It is one edge case of what you're all working on.

Short version: continuously running an add-one-document/commit/replicate cycle
is a reliable way to make the deletion policy misfire.

Try the [detailed test 
scenario|https://issues.apache.org/jira/browse/SOLR-1383?focusedCommentId=12749190&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12749190].

