[jira] [Commented] (LUCENE-5038) Don't call MergePolicy / IndexWriter during DWPT Flush

2013-06-14 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684089#comment-13684089
 ] 

Commit Tag Bot commented on LUCENE-5038:


[branch_4x commit] simonw
http://svn.apache.org/viewvc?view=revision&revision=1493319

LUCENE-5038: Disable CFS on IWC for TestTermInfosReaderIndex non-CFS files are 
expected

> Don't call MergePolicy / IndexWriter during DWPT Flush
> --
>
> Key: LUCENE-5038
> URL: https://issues.apache.org/jira/browse/LUCENE-5038
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Affects Versions: 5.0, 4.3
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 5.0, 4.4
>
> Attachments: LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch, 
> LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch
>
>
> We currently consult the IndexWriter -> MergePolicy to decide whether we need 
> to write CFS or not, which is bad in many ways:
> - we should call the MergePolicy only during merges
> - we should never sync on IW during a DWPT flush
> - we should be able to decide whether to write CFS before the flush, i.e. we 
> could write parts of the flush directly to CFS or even start writing stored 
> fields directly.
> - in the NRT case it might make sense to write all flushes to CFS to minimize 
> file descriptors independent of the index size.
> I wonder if we can use a simple boolean for this in the IWC and get away with 
> not consulting the merge policy. This would simplify concurrency a lot here 
> already.
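
A minimal sketch of what the proposed IWC switch could look like from user code, assuming the boolean lands on IndexWriterConfig as setUseCompoundFile (the setter name is an assumption based on the later commit messages on this issue, not part of the description above):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class CompoundFileFlushSketch {
  public static void main(String[] args) throws Exception {
    Directory dir = new RAMDirectory();
    IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_44,
        new StandardAnalyzer(Version.LUCENE_44));
    // Decide up front whether flushed segments should be written as CFS, so a
    // DWPT flush never has to consult the MergePolicy or synchronize on IW.
    conf.setUseCompoundFile(true); // assumed setter name
    try (IndexWriter writer = new IndexWriter(dir, conf)) {
      // add documents here; each flushed segment honors the flag above
    }
  }
}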




Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.8.0-ea-b93) - Build # 6074 - Failure!

2013-06-14 Thread Simon Willnauer
This is a test bug: I forgot to call IWC#setCompoundFile(false) - I will
commit a fix shortly.


On Sat, Jun 15, 2013 at 8:04 AM, Policeman Jenkins Server <
jenk...@thetaphi.de> wrote:

> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6074/
> Java: 64bit/jdk1.8.0-ea-b93 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC
>
> 2 tests failed.
> FAILED:
>  
> junit.framework.TestSuite.org.apache.lucene.codecs.lucene3x.TestTermInfosReaderIndex
>
> Error Message:
> _0.fnm in 
> dir=org.apache.lucene.store.RAMDirectory@5a63175flockFactory=org.apache.lucene.store.SingleInstanceLockFactory@54a4f263
>
> Stack Trace:
> java.io.FileNotFoundException: _0.fnm in
> dir=org.apache.lucene.store.RAMDirectory@5a63175flockFactory=org.apache.lucene.store.SingleInstanceLockFactory@54a4f263
> at __randomizedtesting.SeedInfo.seed([48306F3AE67315D5]:0)
> at
> org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:547)
> at
> org.apache.lucene.codecs.lucene3x.PreFlexRWFieldInfosReader.read(PreFlexRWFieldInfosReader.java:45)
> at
> org.apache.lucene.codecs.lucene3x.TestTermInfosReaderIndex.beforeClass(TestTermInfosReaderIndex.java:99)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:491)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:677)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
> at
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> at
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
> at
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> at
> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
> at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
> at java.lang.Thread.run(Thread.java:724)
>
>
> FAILED:
>  
> junit.framework.TestSuite.org.apache.lucene.codecs.lucene3x.TestTermInfosReaderIndex
>
> Error Message:
>
>
> Stack Trace:
> java.lang.NullPointerException
> at __randomizedtesting.SeedInfo.seed([48306F3AE67315D5]:0)
> at
> org.apache.lucene.codecs.lucene3x.TestTermInfosReaderIndex.afterClass(TestTermInfosReaderIndex.java:117)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:491)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
> at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:700)
> at
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> at
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate

[JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.8.0-ea-b93) - Build # 6074 - Failure!

2013-06-14 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6074/
Java: 64bit/jdk1.8.0-ea-b93 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

2 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.lucene.codecs.lucene3x.TestTermInfosReaderIndex

Error Message:
_0.fnm in dir=org.apache.lucene.store.RAMDirectory@5a63175f 
lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@54a4f263

Stack Trace:
java.io.FileNotFoundException: _0.fnm in 
dir=org.apache.lucene.store.RAMDirectory@5a63175f 
lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@54a4f263
at __randomizedtesting.SeedInfo.seed([48306F3AE67315D5]:0)
at 
org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:547)
at 
org.apache.lucene.codecs.lucene3x.PreFlexRWFieldInfosReader.read(PreFlexRWFieldInfosReader.java:45)
at 
org.apache.lucene.codecs.lucene3x.TestTermInfosReaderIndex.beforeClass(TestTermInfosReaderIndex.java:99)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:491)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:677)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:724)


FAILED:  
junit.framework.TestSuite.org.apache.lucene.codecs.lucene3x.TestTermInfosReaderIndex

Error Message:


Stack Trace:
java.lang.NullPointerException
at __randomizedtesting.SeedInfo.seed([48306F3AE67315D5]:0)
at 
org.apache.lucene.codecs.lucene3x.TestTermInfosReaderIndex.afterClass(TestTermInfosReaderIndex.java:117)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:491)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:700)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRul

[jira] [Commented] (SOLR-4791) solr.xml sharedLib does not work in 4.3.0

2013-06-14 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684082#comment-13684082
 ] 

Shalin Shekhar Mangar commented on SOLR-4791:
-

Ah, so what should we do now? I saw the commit bot entry on lucene_solr_4_3 and 
assumed that Erick had backported this issue. The release vote has passed and 
artifacts are making their way into the mirrors.

> solr.xml sharedLib does not work in 4.3.0
> -
>
> Key: SOLR-4791
> URL: https://issues.apache.org/jira/browse/SOLR-4791
> Project: Solr
>  Issue Type: Bug
>  Components: multicore
>Affects Versions: 4.3
>Reporter: Jan Høydahl
>Assignee: Erick Erickson
> Fix For: 5.0, 4.4
>
> Attachments: closeLoader.patch, SOLR-4791.patch, SOLR-4791.patch, 
> SOLR-4791.patch, SOLR-4791-test.patch
>
>
> The sharedLib attribute on the {{solr}} tag in solr.xml stopped working in 4.3.
> When using an old-style solr.xml with sharedLib defined on the solr tag, Solr 
> does not load any of the shared libraries. Simply swapping out solr.war for 
> the 4.2.1 one brings sharedLib loading back.
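
For illustration, a hypothetical old-style solr.xml of the kind the report describes, with sharedLib set on the solr tag (all attribute values here are made up; only the sharedLib attribute itself comes from the report):

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true" sharedLib="lib">
  <!-- jars under the sharedLib directory (relative to solr home) should be
       added to the shared class loader for all cores; this is what stopped
       working in 4.3.0 -->
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="collection1" />
  </cores>
</solr>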




[jira] [Assigned] (SOLR-4923) Replica index is one version behind sending the commit to non-leader instance

2013-06-14 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-4923:
-

Assignee: Mark Miller

> Replica index is one version behind sending the commit to non-leader instance
> -
>
> Key: SOLR-4923
> URL: https://issues.apache.org/jira/browse/SOLR-4923
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java)
>Affects Versions: 4.2
> Environment: Solr 4.2.1
> OS X 10.8.3
>Reporter: Ricardo Merizalde
>Assignee: Mark Miller
>Priority: Critical
> Fix For: 5.0, 4.4
>
> Attachments: SOLR-4923_hoss_test.patch, SOLR-4923.patch, 
> SOLR-4923-test-1.patch
>
>
> I was actually trying to debug an issue we are experiencing in production 
> where the replica version is ahead of the leader when I noticed this problem.
> For my tests I'm running two Solr instances with distributed updates 
> (SolrCloud). ZK runs embedded within one of the instances.
> The test consists of updating one field in a single document. If I send an 
> update to the leader, the index is replicated correctly. However, if I run the 
> update against the follower replica, only the leader is updated correctly. I 
> can reproduce this using both hard and soft commits. Here is the command I'm 
> running:
> curl 
> "http://localhost:8999/solr/rulePreview/update?commit=true&softCommit=true" 
> -H "Content-Type: text/xml" --data-binary '...
> If I execute a second commit against the follower, the leader will have the 
> most recent update and the follower will have the update from the first commit.
> For example, my field is named category and initially it contains the value 
> cat_1. If I update the value to cat_2, the leader sees the change but the 
> follower doesn't. If a second commit updates the field to cat_3, the leader 
> will return cat_3 but the follower returns cat_2. 
> Reloading the core in the follower fixes the problem.
> The logs seem to confirm that the follower gets the latest index version. 
> However, the version in the logs doesn't match the one in the Core Admin UI 
> or Luke. 
> Here are some logs from the leader:
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.processor.LogUpdateProcessor 
> finish
> INFO: [rulePreview_en] webapp=/solr path=/update 
> params={distrib.from=http://192.168.1.106:8998/solr/rulePreview_en/&update.distrib=TOLEADER&wt=javabin&version=2}
>  {add=[importedRedirect1 (1437700518392627200)]} 0 11
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start 
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
> Jun 12, 2013 10:34:19 PM org.apache.solr.search.SolrIndexSearcher 
> INFO: Opening Searcher@47e4e06c main
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener sending requests to Searcher@47e4e06c 
> main{StandardDirectoryReader(segments_3g:467:nrt _2a(4.2.1):C134/1 
> _3c(4.2.1):C1)}
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener done.
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.SolrCore registerSearcher
> INFO: [rulePreview_en] Registered new searcher Searcher@47e4e06c 
> main{StandardDirectoryReader(segments_3g:467:nrt _2a(4.2.1):C134/1 
> _3c(4.2.1):C1)}
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.processor.LogUpdateProcessor 
> finish
> INFO: [rulePreview_en] webapp=/solr path=/update 
> params={waitSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=true}
>  {commit=} 0 12
> And the logs from the follower:
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start 
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
> Jun 12, 2013 10:34:19 PM org.apache.solr.search.SolrIndexSearcher 
> INFO: Opening Searcher@1e23cfc main
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener sending requests to Searcher@1e23cfc 
> main{StandardDirectoryReader(segments_3i:463:nrt _2a(4.2.1):C134/1 
> _3b(4.2.1):C1)}
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener done.
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.SolrCore registerSearcher
> INFO: [rulePreview_en] Registered new searcher Searcher@1e23cfc 
> main{StandardDirectoryReader(segments_3i:463:nrt _2a(4.2.1):C134/1 
> _3b(4.2.1):C1)}
> Jun 12, 2013 10:34:1

Adding a mixture of language models to Lucene 4.0

2013-06-14 Thread Nikita Zhiltsov
Hi all,

I've just published a tiny extension to Lucene 4.0 that enables a mixture
of language models using the standard FunctionQuery and ValueSource classes:
https://github.com/nzhiltsov/lucene-mlm

I'd like you to assess the possibility of integrating this code into
Lucene. I'd appreciate any comments or fixes.

NB: The implementation avoids using LMSimilarity on a per-field basis,
because that would break the computation of correct Dirichlet priors for
non-matched terms; the standard LMSimilarity class fails to include such
terms while calculating term frequencies and treats them as zero-probability
entries.
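
For context, one standard formulation of a mixture of per-field language models with Dirichlet smoothing looks like the following (a sketch using the usual textbook symbols; the field weights w_f and the Dirichlet parameter \mu are conventions assumed here, not taken from the repository):

    P(q \mid d) \;=\; \prod_{t \in q} \sum_{f} w_f \, P(t \mid d_f),
    \qquad
    P(t \mid d_f) \;=\; \frac{\mathrm{tf}(t, d_f) + \mu \, P(t \mid C_f)}{|d_f| + \mu}

The Dirichlet-smoothed per-field estimate keeps P(t | d_f) non-zero even when tf(t, d_f) = 0, which is exactly the non-matched-term case that the note above says a per-field LMSimilarity would treat as a zero-probability entry.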

-- 

Nikita Zhiltsov

Visiting Graduate Student
Emory University
Intelligent Information Access Lab
E500 Emerson Hall, Atlanta, Georgia, USA
Phone: (404) 834-5364
E-mail: znik...@emory.edu


-
Graduate Student, Research Fellow
Kazan Federal University
Computational Linguistics Laboratory
Russia, 420008
Kazan, Prof. Nuzhina Str., 1/37 room 117
Skype: nickita.jhiltsov
Personal page: http://cll.niimm.ksu.ru/~nzhiltsov
E-mail: nikita.zhilt...@gmail.com

-


[jira] [Updated] (SOLR-4923) Replica index is one version behind sending the commit to non-leader instance

2013-06-14 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-4923:
---

Attachment: SOLR-4923_hoss_test.patch

[~markrmil...@gmail.com] here's a patch that modifies BasicDistributedZkTest to 
reliably produce a failure which I _think_ is the same as the one being 
discussed here.

With your patch applied, my test patch stops failing.

> Replica index is one version behind sending the commit to non-leader instance
> -
>
> Key: SOLR-4923
> URL: https://issues.apache.org/jira/browse/SOLR-4923
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java)
>Affects Versions: 4.2
> Environment: Solr 4.2.1
> OS X 10.8.3
>Reporter: Ricardo Merizalde
>Priority: Critical
> Fix For: 5.0, 4.4
>
> Attachments: SOLR-4923_hoss_test.patch, SOLR-4923.patch, 
> SOLR-4923-test-1.patch
>
>
> I was actually trying to debug an issue we are experiencing in production 
> where the replica version is ahead of the leader when I noticed this problem.
> For my tests I'm running two Solr instances with distributed updates 
> (SolrCloud). ZK runs embedded within one of the instances.
> The test consists of updating one field in a single document. If I send an 
> update to the leader, the index is replicated correctly. However, if I run the 
> update against the follower replica, only the leader is updated correctly. I 
> can reproduce this using both hard and soft commits. Here is the command I'm 
> running:
> curl 
> "http://localhost:8999/solr/rulePreview/update?commit=true&softCommit=true" 
> -H "Content-Type: text/xml" --data-binary '...
> If I execute a second commit against the follower, the leader will have the 
> most recent update and the follower will have the update from the first commit.
> For example, my field is named category and initially it contains the value 
> cat_1. If I update the value to cat_2, the leader sees the change but the 
> follower doesn't. If a second commit updates the field to cat_3, the leader 
> will return cat_3 but the follower returns cat_2. 
> Reloading the core in the follower fixes the problem.
> The logs seem to confirm that the follower gets the latest index version. 
> However, the version in the logs doesn't match the one in the Core Admin UI 
> or Luke. 
> Here are some logs from the leader:
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.processor.LogUpdateProcessor 
> finish
> INFO: [rulePreview_en] webapp=/solr path=/update 
> params={distrib.from=http://192.168.1.106:8998/solr/rulePreview_en/&update.distrib=TOLEADER&wt=javabin&version=2}
>  {add=[importedRedirect1 (1437700518392627200)]} 0 11
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start 
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
> Jun 12, 2013 10:34:19 PM org.apache.solr.search.SolrIndexSearcher 
> INFO: Opening Searcher@47e4e06c main
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener sending requests to Searcher@47e4e06c 
> main{StandardDirectoryReader(segments_3g:467:nrt _2a(4.2.1):C134/1 
> _3c(4.2.1):C1)}
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener done.
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.SolrCore registerSearcher
> INFO: [rulePreview_en] Registered new searcher Searcher@47e4e06c 
> main{StandardDirectoryReader(segments_3g:467:nrt _2a(4.2.1):C134/1 
> _3c(4.2.1):C1)}
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.processor.LogUpdateProcessor 
> finish
> INFO: [rulePreview_en] webapp=/solr path=/update 
> params={waitSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=true}
>  {commit=} 0 12
> And the logs from the follower:
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start 
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
> Jun 12, 2013 10:34:19 PM org.apache.solr.search.SolrIndexSearcher 
> INFO: Opening Searcher@1e23cfc main
> Jun 12, 2013 10:34:19 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener sending requests to Searcher@1e23cfc 
> main{StandardDirectoryReader(segments_3i:463:nrt _2a(4.2.1):C134/1 
> _3b(4.2.1):C1)}
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener done.
> Jun 12, 2013 10:34:19 PM org.apache.solr.core.So

[jira] [Updated] (SOLR-4791) solr.xml sharedLib does not work in 4.3.0

2013-06-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-4791:
--

Fix Version/s: (was: 4.3.1)

> solr.xml sharedLib does not work in 4.3.0
> -
>
> Key: SOLR-4791
> URL: https://issues.apache.org/jira/browse/SOLR-4791
> Project: Solr
>  Issue Type: Bug
>  Components: multicore
>Affects Versions: 4.3
>Reporter: Jan Høydahl
>Assignee: Erick Erickson
> Fix For: 5.0, 4.4
>
> Attachments: closeLoader.patch, SOLR-4791.patch, SOLR-4791.patch, 
> SOLR-4791.patch, SOLR-4791-test.patch
>
>
> The sharedLib attribute on the {{solr}} tag in solr.xml stopped working in 4.3.
> When using an old-style solr.xml with sharedLib defined on the solr tag, Solr 
> does not load any of the shared libraries. Simply swapping out solr.war for 
> the 4.2.1 one brings sharedLib loading back.




[jira] [Commented] (SOLR-4791) solr.xml sharedLib does not work in 4.3.0

2013-06-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683878#comment-13683878
 ] 

Jan Høydahl commented on SOLR-4791:
---

Ok, I see what happened: the CHANGES commit 1481043 went to lucene_solr_4_3, 
but it should have been on branch_4x. Will you fix it?

> solr.xml sharedLib does not work in 4.3.0
> -
>
> Key: SOLR-4791
> URL: https://issues.apache.org/jira/browse/SOLR-4791
> Project: Solr
>  Issue Type: Bug
>  Components: multicore
>Affects Versions: 4.3
>Reporter: Jan Høydahl
>Assignee: Erick Erickson
> Fix For: 5.0, 4.4, 4.3.1
>
> Attachments: closeLoader.patch, SOLR-4791.patch, SOLR-4791.patch, 
> SOLR-4791.patch, SOLR-4791-test.patch
>
>
> The sharedLib attribute on the {{solr}} tag in solr.xml stopped working in 4.3.
> When using an old-style solr.xml with sharedLib defined on the solr tag, Solr 
> does not load any of the shared libraries. Simply swapping out solr.war for 
> the 4.2.1 one brings sharedLib loading back.




[jira] [Commented] (SOLR-4924) indices getting out of sync with SolrCloud

2013-06-14 Thread Luis Guerrero (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683849#comment-13683849
 ] 

Luis Guerrero commented on SOLR-4924:
-

Wouldn't the core with the higher version number have to be the leader? I'm 
also experiencing this issue, but the replica is only about 100 versions 
behind. Mine is also a single-shard setup, while the issue Markus mentions is 
for multiple shards, so I'm not sure this is really a duplicate. I'm 
currently running Solr 4.1.0.

> indices getting out of sync with SolrCloud
> --
>
> Key: SOLR-4924
> URL: https://issues.apache.org/jira/browse/SOLR-4924
> Project: Solr
>  Issue Type: Bug
>  Components: replication (java), SolrCloud
>Affects Versions: 4.2
> Environment: Linux 2.6.18-308.16.1.el5 #1 SMP Tue Oct 2 22:01:43 EDT 
> 2012 x86_64 x86_64 x86_64 GNU/Linux
> CentOS release 5.8 (Final)
> Solr 4.2.1
>Reporter: Ricardo Merizalde
>
> We are experiencing an issue in our production servers where the indices get 
> out of sync. Customers will see different results/result sorting depending on 
> the instance that serves the request.
> We currently have 2 instances with a single shard. This is our update handler 
> configuration:
> 
>   
> 
> 60
> 
> 5000
> 
> false
>   
>   
> 
> 5000
>   
>   
> ${solr.data.dir:}
>   
> 
> When the indices get out of sync the follower replica ends up with a higher 
> version than the master. Optimizing the leader or reloading the follower core 
> doesn't help. The only way to get the indices back in sync is to restart the 
> server.
> This is an example state of the leader:
> version: 1102541
> numDocs: 214007
> maxDoc: 370861
> deletedDocs: 156854 
> While the follower core has the following state:
> version: 1109143
> numDocs: 213890
> maxDoc: 341585
> deletedDocs: 127695 




[JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 1332 - Still Failing

2013-06-14 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java7/1332/

1 tests failed.
REGRESSION:  org.apache.lucene.index.TestFieldsReader.testExceptions

Error Message:
Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
at 
__randomizedtesting.SeedInfo.seed([14FE64D04A4B2A45:62FF3678E23654F3]:0)
at org.apache.lucene.util.BytesRef.copyBytes(BytesRef.java:196)
at org.apache.lucene.util.BytesRef.deepCopyOf(BytesRef.java:343)
at 
org.apache.lucene.codecs.lucene3x.TermBuffer.toTerm(TermBuffer.java:113)
at 
org.apache.lucene.codecs.lucene3x.SegmentTermEnum.term(SegmentTermEnum.java:184)
at 
org.apache.lucene.codecs.lucene3x.Lucene3xFields$PreTermsEnum.next(Lucene3xFields.java:863)
at 
org.apache.lucene.index.MultiTermsEnum.pushTop(MultiTermsEnum.java:292)
at org.apache.lucene.index.MultiTermsEnum.next(MultiTermsEnum.java:318)
at org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:103)
at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:72)
at 
org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:365)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:98)
at 
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3762)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3366)
at 
org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:40)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1882)
at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1692)
at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1645)
at 
org.apache.lucene.index.TestFieldsReader.testExceptions(TestFieldsReader.java:204)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)




Build Log:
[...truncated 1083 lines...]
[junit4:junit4] Suite: org.apache.lucene.index.TestFieldsReader
[junit4:junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TestFieldsReader -Dtests.method=testExceptions 
-Dtests.seed=14FE64D04A4B2A45 -Dtests.multiplier=3 -Dtests.slow=true 
-Dtests.locale=de_DE -Dtests.timezone=Australia/Queensland 
-Dtests.file.encoding=US-ASCII
[junit4:junit4] ERROR   3.17s J0 | TestFieldsReader.testExceptions <<<
[junit4:junit4]> Throwable #1: java.lang.OutOfMemoryError: Java heap space
[junit4:junit4]>at 
__randomizedtesting.SeedInfo.seed([14FE64D04A4B2A45:62FF3678E23654F3]:0)
[junit4:junit4]>at 
org.apache.lucene.util.BytesRef.copyBytes(BytesRef.java:196)
[junit4:junit4]>at 
org.apache.lucene.util.BytesRef.deepCopyOf(BytesRef.java:343)
[junit4:junit4]>at 
org.apache.lucene.codecs.lucene3x.TermBuffer.toTerm(TermBuffer.java:113)
[junit4:junit4]>at 
org.apache.lucene.codecs.lucene3x.SegmentTermEnum.term(SegmentTermEnum.java:184)
[junit4:junit4]>at 
org.apache.lucene.codecs.lucene3x.Lucene3xFields$PreTermsEnum.next(Lucene3xFields.java:863)
[junit4:junit4]>at 
org.apache.lucene.index.MultiTermsEnum.pushTop(MultiTermsEnum.java:292)
[junit4:junit4]>at 
org.apache.lucene.index.MultiTermsEnum.next(MultiTermsEnum.java:318)
[junit4:junit4]>at 
org.apache.lucene.codecs.TermsConsumer.merge(TermsConsumer.java:103)
[junit4:junit4]>at 
org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:72)
[junit4:junit4]>at 
org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:365)
[junit4:junit4]>at 
org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:98)
[junit4:junit4]>at 
org.apache.lucene.

Fwd: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_45) - Build # 6066 - Still Failing!

2013-06-14 Thread david.w.smi...@gmail.com
Dawid,

Could you please take a look at the reproducibility of this test failure in
lucene/spatial?  I tried to reproduce it but couldn't, and I thought
perhaps you might have some insight because I'm using some
RandomizedTesting features that aren't as often used, like @Repeat.  For
example, one thing fishy is this log message:

[junit4:junit4]   2> NOTE: reproduce with: ant test
 -Dtestcase=SpatialOpRecursivePrefixTreeTest -Dtests.method="testContains
{#1 seed=[9166D28D6532217A:472BE5C4B7344982]}"
-Dtests.seed=9166D28D6532217A -Dtests.multiplier=3 -Dtests.slow=true
-Dtests.locale=uk_UA -Dtests.timezone=Etc/GMT-6 -Dtests.file.encoding=UTF-8

Notice the -Dtests.method="testContains {#1
seed=[9166D28D6532217A:472BE5C4B7344982]}" part, which is wrong because if
I do that, it won't find the method to test.  If I change this to simply
testContains, and set the seed normally with -Dtests.seed=91 then I still
can't reproduce the problem.  This test appears to have failed a bunch of
times lately with different seeds.
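
For reference, the simplified form of the reproduce line described above (all parameter values copied verbatim from the Jenkins output quoted below) would be:

ant test -Dtestcase=SpatialOpRecursivePrefixTreeTest -Dtests.method=testContains \
    -Dtests.seed=9166D28D6532217A -Dtests.multiplier=3 -Dtests.slow=true \
    -Dtests.locale=uk_UA -Dtests.timezone=Etc/GMT-6 -Dtests.file.encoding=UTF-8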

~ David

-- Forwarded message --
From: Policeman Jenkins Server 
Date: Fri, Jun 14, 2013 at 9:33 PM
Subject: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_45) - Build # 6066
- Still Failing!
To: dev@lucene.apache.org


Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6066/
Java: 32bit/jdk1.6.0_45 -server -XX:+UseSerialGC

1 tests failed.
FAILED:
 org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains
{#1 seed=[9166D28D6532217A:472BE5C4B7344982]}

Error Message:
Shouldn't match I
#0:ShapePair(Rect(minX=102.0,maxX=112.0,minY=-36.0,maxY=120.0) ,
Rect(minX=168.0,maxX=175.0,minY=-1.0,maxY=11.0))
Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)

Stack Trace:
java.lang.AssertionError: Shouldn't match I
#0:ShapePair(Rect(minX=102.0,maxX=112.0,minY=-36.0,maxY=120.0) ,
Rect(minX=168.0,maxX=175.0,minY=-1.0,maxY=11.0))
Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)
at
__randomizedtesting.SeedInfo.seed([9166D28D6532217A:472BE5C4B7344982]:0)
at org.junit.Assert.fail(Assert.java:93)
at
org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.fail(SpatialOpRecursivePrefixTreeTest.java:287)
at
org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.doTest(SpatialOpRecursivePrefixTreeTest.java:273)
at
org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains(SpatialOpRecursivePrefixTreeTest.java:101)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
  

[jira] [Commented] (SOLR-4791) solr.xml sharedLib does not work in 4.3.0

2013-06-14 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683834#comment-13683834
 ] 

Erick Erickson commented on SOLR-4791:
--

I certainly never backported it, and there's no commit tag bot entry that'd 
make me believe it was.

Erick

> solr.xml sharedLib does not work in 4.3.0
> -
>
> Key: SOLR-4791
> URL: https://issues.apache.org/jira/browse/SOLR-4791
> Project: Solr
>  Issue Type: Bug
>  Components: multicore
>Affects Versions: 4.3
>Reporter: Jan Høydahl
>Assignee: Erick Erickson
> Fix For: 5.0, 4.4, 4.3.1
>
> Attachments: closeLoader.patch, SOLR-4791.patch, SOLR-4791.patch, 
> SOLR-4791.patch, SOLR-4791-test.patch
>
>
> The sharedLib attribute on the {{solr}} tag in solr.xml stopped working in 4.3.
> When using an old-style solr.xml with sharedLib defined on the solr tag, Solr 
> does not load any of the shared libraries. Simply swapping out solr.war for 
> the 4.2.1 one brings sharedLib loading back.




[jira] [Commented] (SOLR-4791) solr.xml sharedLib does not work in 4.3.0

2013-06-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683761#comment-13683761
 ] 

Jan Høydahl commented on SOLR-4791:
---

Testing the new 4.3.1 - still not getting sharedLib to work. It seems this issue 
was never backported to 4.3.1 even though the CHANGES entry was???

> solr.xml sharedLib does not work in 4.3.0
> -
>
> Key: SOLR-4791
> URL: https://issues.apache.org/jira/browse/SOLR-4791
> Project: Solr
>  Issue Type: Bug
>  Components: multicore
>Affects Versions: 4.3
>Reporter: Jan Høydahl
>Assignee: Erick Erickson
> Fix For: 5.0, 4.4, 4.3.1
>
> Attachments: closeLoader.patch, SOLR-4791.patch, SOLR-4791.patch, 
> SOLR-4791.patch, SOLR-4791-test.patch
>
>
> The sharedLib attribute on the {{solr}} tag in solr.xml stopped working in 4.3.
> When using an old-style solr.xml with sharedLib defined on the solr tag, Solr 
> does not load any of the shared libraries. Simply swapping out solr.war for 
> the 4.2.1 one brings sharedLib loading back.




[jira] [Commented] (LUCENE-5038) Don't call MergePolicy / IndexWriter during DWPT Flush

2013-06-14 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683756#comment-13683756
 ] 

Commit Tag Bot commented on LUCENE-5038:


[branch_4x commit] simonw
http://svn.apache.org/viewvc?view=revision&revision=1493238

LUCENE-5038: Set useCFS on IWC in backwards-compatibility tests

> Don't call MergePolicy / IndexWriter during DWPT Flush
> --
>
> Key: LUCENE-5038
> URL: https://issues.apache.org/jira/browse/LUCENE-5038
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Affects Versions: 5.0, 4.3
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 5.0, 4.4
>
> Attachments: LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch, 
> LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch
>
>
> We currently consult the IndexWriter -> MergePolicy to decide whether we need 
> to write CFS or not, which is bad in many ways:
> - we should call the MergePolicy only during merges
> - we should never sync on IW during a DWPT flush
> - we should be able to decide whether to write CFS before the flush, i.e. we 
> could write parts of the flush directly to CFS or even start writing stored 
> fields directly.
> - in the NRT case it might make sense to write all flushes to CFS to minimize 
> file descriptors independent of the index size.
> I wonder if we can use a simple boolean for this in the IWC and get away with 
> not consulting the merge policy. This would simplify concurrency a lot here 
> already.




Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b93) - Build # 6067 - Still Failing!

2013-06-14 Thread Simon Willnauer
I committed a fix


On Fri, Jun 14, 2013 at 10:33 PM, Simon Willnauer  wrote:

> I am on it.
>
>
> On Fri, Jun 14, 2013 at 10:32 PM, Uwe Schindler  wrote:
>
>> Seems related to the changes in merge policies and cfs.
>>
>>
>>
>> Policeman Jenkins Server  wrote:
>>>
>>>  Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6067/
>>> Java: 32bit/jdk1.8.0-ea-b93 -client -XX:+UseParallelGC
>>>
>>> 1 tests failed.
>>> REGRESSION:  
>>> org.apache.lucene.index.TestBackwardsCompatibility3x.testExactFileNames
>>>
>>> Error Message:
>>> incorrect filenames in index: expected: _0.cfe _0.cfs _0.si 
>>> _0_1.del segments.gen segments_2  or _0.cfe _0.cfs _0.si
>>>  _0_1.liv segments.gen segments_2  actual: _0.fdt _0.fdx
>>>  _0.fnm _0.nvd _0.nvm _0.si _0.tvd _0.tvx _0_1.del  
>>>_0_Asserting_0.dvd _0_Asserting_0.dvm _0_CheapBastard_0.dvdd 
>>> _0_CheapBastard_0.dvdm _0_Direct_0.doc _0_Direct_0.pay 
>>> _0_Direct_0.pos _0_Direct_0.tim _0_Direct_0.tip 
>>> _0_Lucene41_0.doc _0_Lucene41_0.pay
>>> _0_Lucene41_0.pos _0_Lucene41_0.tim _0_Lucene41_0.tip 
>>> _0_Lucene42_0.dvd _0_Lucene42_0.dvm _0_Memory_0.ram 
>>> _0_SimpleText_0.dat _0_SimpleText_0.pst segments.gen segments_2
>>>
>>> Stack Trace:
>>>
>>> java.lang.AssertionError: incorrect filenames in index: expected:
>>> _0.cfe
>>> _0.cfs
>>> _0.si
>>> _0_1.del
>>> segments.gen
>>> segments_2
>>>  or _0.cfe
>>> _0.cfs
>>>
>>> _0.si
>>> _0_1.liv
>>> segments.gen
>>> segments_2
>>>  actual:
>>> _0.fdt
>>> _0.fdx
>>> _0.fnm
>>> _0.nvd
>>> _0.nvm
>>> _0.si
>>>
>>> _0.tvd
>>> _0.tvx
>>> _0_1.del
>>> _0_Asserting_0.dvd
>>> _0_Asserting_0.dvm
>>> _0_CheapBastard_0.dvdd
>>> _0_CheapBastard_0.dvdm
>>> _0_Direct_0.doc
>>> _0_Direct_0.pay
>>> _0_Direct_0.pos
>>>
>>> _0_Direct_0.tim
>>>
>>> _0_Direct_0.tip
>>> _0_Lucene41_0.doc
>>> _0_Lucene41_0.pay
>>> _0_Lucene41_0.pos
>>> _0_Lucene41_0.tim
>>> _0_Lucene41_0.tip
>>> _0_Lucene42_0.dvd
>>> _0_Lucene42_0.dvm
>>> _0_Memory_0.ram
>>> _0_SimpleText_0.dat
>>>
>>> _0_SimpleText_0.pst
>>> segments.gen
>>> segments_2
>>>  at __randomizedtesting.SeedInfo.seed([B00E90C39659DDF1:41A53BC4769E4F38]:0)
>>>  at org.junit.Assert.fail(Assert.java:93)
>>>  at 
>>> org.apache.lucene.index.TestBackwardsCompatibility3x.testExactFileNames(TestBackwardsCompatibility3x.java:656)
>>>
>>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>  at 
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>  at 
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>
>>>  at java.lang.reflect.Method.invoke(Method.java:491)
>>>  at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
>>>  at
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
>>>  at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
>>>  at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
>>>
>>>  at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
>>>  at 
>>> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>>>  at 
>>> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
>>>
>>>  at 
>>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>>>  at 
>>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>>>
>>>  at 
>>> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>>>  at
>>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>>>  at 
>>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>>>  at 
>>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>>
>>>  at 
>>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
>>>  at 
>>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
>>>  at 
>>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
>>>
>>>  at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
>>>  at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
>>>  at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
>>>
>>>  at
>>> com.ca

Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b93) - Build # 6067 - Still Failing!

2013-06-14 Thread Simon Willnauer
I am on it.


On Fri, Jun 14, 2013 at 10:32 PM, Uwe Schindler  wrote:

> Seems related to the changes in merge policies and cfs.
>
>
>
> Policeman Jenkins Server  wrote:
>>
>> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6067/
>> Java: 32bit/jdk1.8.0-ea-b93 -client -XX:+UseParallelGC
>>
>> 1 tests failed.
>> REGRESSION:  
>> org.apache.lucene.index.TestBackwardsCompatibility3x.testExactFileNames
>>
>> Error Message:
>> incorrect filenames in index: expected: _0.cfe _0.cfs _0.si 
>> _0_1.del segments.gen segments_2  or _0.cfe _0.cfs _0.si 
>> _0_1.liv segments.gen segments_2  actual: _0.fdt _0.fdx 
>> _0.fnm _0.nvd _0.nvm _0.si _0.tvd _0.tvx _0_1.del
>>  _0_Asserting_0.dvd _0_Asserting_0.dvm _0_CheapBastard_0.dvdd 
>> _0_CheapBastard_0.dvdm _0_Direct_0.doc _0_Direct_0.pay 
>> _0_Direct_0.pos _0_Direct_0.tim _0_Direct_0.tip 
>> _0_Lucene41_0.doc _0_Lucene41_0.pay
>> _0_Lucene41_0.pos _0_Lucene41_0.tim _0_Lucene41_0.tip 
>> _0_Lucene42_0.dvd _0_Lucene42_0.dvm _0_Memory_0.ram 
>> _0_SimpleText_0.dat _0_SimpleText_0.pst segments.gen segments_2
>>
>> Stack Trace:
>> java.lang.AssertionError: incorrect filenames in index: expected:
>> _0.cfe
>> _0.cfs
>> _0.si
>> _0_1.del
>> segments.gen
>> segments_2
>>  or _0.cfe
>> _0.cfs
>> _0.si
>> _0_1.liv
>> segments.gen
>> segments_2
>>  actual:
>> _0.fdt
>> _0.fdx
>> _0.fnm
>> _0.nvd
>> _0.nvm
>> _0.si
>> _0.tvd
>> _0.tvx
>> _0_1.del
>> _0_Asserting_0.dvd
>> _0_Asserting_0.dvm
>> _0_CheapBastard_0.dvdd
>> _0_CheapBastard_0.dvdm
>> _0_Direct_0.doc
>> _0_Direct_0.pay
>> _0_Direct_0.pos
>> _0_Direct_0.tim
>>
>> _0_Direct_0.tip
>> _0_Lucene41_0.doc
>> _0_Lucene41_0.pay
>> _0_Lucene41_0.pos
>> _0_Lucene41_0.tim
>> _0_Lucene41_0.tip
>> _0_Lucene42_0.dvd
>> _0_Lucene42_0.dvm
>> _0_Memory_0.ram
>> _0_SimpleText_0.dat
>> _0_SimpleText_0.pst
>> segments.gen
>> segments_2
>>  at __randomizedtesting.SeedInfo.seed([B00E90C39659DDF1:41A53BC4769E4F38]:0)
>>  at org.junit.Assert.fail(Assert.java:93)
>>  at 
>> org.apache.lucene.index.TestBackwardsCompatibility3x.testExactFileNames(TestBackwardsCompatibility3x.java:656)
>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>  at 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>  at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>  at java.lang.reflect.Method.invoke(Method.java:491)
>>  at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
>>  at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
>>  at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
>>  at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
>>  at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
>>  at 
>> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>>  at 
>> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
>>  at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>>  at 
>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>>  at 
>> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>>  at
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>>  at 
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>>  at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>  at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
>>  at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
>>  at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
>>  at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
>>  at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
>>  at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
>>  at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
>>  at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>>  at 
>> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRul

Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b93) - Build # 6067 - Still Failing!

2013-06-14 Thread Uwe Schindler
Seems related to the changes in merge policies and cfs.



Policeman Jenkins Server  wrote:
>Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6067/
>Java: 32bit/jdk1.8.0-ea-b93 -client -XX:+UseParallelGC
>
>1 tests failed.
>REGRESSION: 
>org.apache.lucene.index.TestBackwardsCompatibility3x.testExactFileNames
>
>Error Message:
>incorrect filenames in index: expected: _0.cfe _0.cfs _0.si
>_0_1.del segments.gen segments_2  or _0.cfe _0.cfs
>_0.si _0_1.liv segments.gen segments_2  actual: _0.fdt 
>_0.fdx _0.fnm _0.nvd _0.nvm _0.si _0.tvd _0.tvx
>_0_1.del _0_Asserting_0.dvd _0_Asserting_0.dvm
>_0_CheapBastard_0.dvdd _0_CheapBastard_0.dvdm _0_Direct_0.doc  
>_0_Direct_0.pay _0_Direct_0.pos _0_Direct_0.tim
>_0_Direct_0.tip _0_Lucene41_0.doc _0_Lucene41_0.pay
>_0_Lucene41_0.pos _0_Lucene41_0.tim _0_Lucene41_0.tip
>_0_Lucene42_0.dvd _0_Lucene42_0.dvm _0_Memory_0.ram
>_0_SimpleText_0.dat _0_SimpleText_0.pst segments.gen
>segments_2
>
>Stack Trace:
>java.lang.AssertionError: incorrect filenames in index: expected:
>_0.cfe
>_0.cfs
>_0.si
>_0_1.del
>segments.gen
>segments_2
> or _0.cfe
>_0.cfs
>_0.si
>_0_1.liv
>segments.gen
>segments_2
> actual:
>_0.fdt
>_0.fdx
>_0.fnm
>_0.nvd
>_0.nvm
>_0.si
>_0.tvd
>_0.tvx
>_0_1.del
>_0_Asserting_0.dvd
>_0_Asserting_0.dvm
>_0_CheapBastard_0.dvdd
>_0_CheapBastard_0.dvdm
>_0_Direct_0.doc
>_0_Direct_0.pay
>_0_Direct_0.pos
>_0_Direct_0.tim
>_0_Direct_0.tip
>_0_Lucene41_0.doc
>_0_Lucene41_0.pay
>_0_Lucene41_0.pos
>_0_Lucene41_0.tim
>_0_Lucene41_0.tip
>_0_Lucene42_0.dvd
>_0_Lucene42_0.dvm
>_0_Memory_0.ram
>_0_SimpleText_0.dat
>_0_SimpleText_0.pst
>segments.gen
>segments_2
>   at
>__randomizedtesting.SeedInfo.seed([B00E90C39659DDF1:41A53BC4769E4F38]:0)
>   at org.junit.Assert.fail(Assert.java:93)
>   at
>org.apache.lucene.index.TestBackwardsCompatibility3x.testExactFileNames(TestBackwardsCompatibility3x.java:656)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at
>sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at
>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:491)
>   at
>com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
>   at
>com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
>   at
>com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
>   at
>com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
>   at
>com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
>   at
>org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>   at
>org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
>   at
>org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>   at
>com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>   at
>org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>   at
>org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>   at
>org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>   at
>com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at
>com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
>   at
>com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
>   at
>com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
>   at
>com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
>   at
>com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
>   at
>com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
>   at
>com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
>   at
>org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>   at
>org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>   at
>com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>   at
>com.carrot

[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b93) - Build # 6067 - Still Failing!

2013-06-14 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6067/
Java: 32bit/jdk1.8.0-ea-b93 -client -XX:+UseParallelGC

1 tests failed.
REGRESSION:  
org.apache.lucene.index.TestBackwardsCompatibility3x.testExactFileNames

Error Message:
incorrect filenames in index: expected: _0.cfe _0.cfs _0.si 
_0_1.del segments.gen segments_2  or _0.cfe _0.cfs _0.si 
_0_1.liv segments.gen segments_2  actual: _0.fdt _0.fdx 
_0.fnm _0.nvd _0.nvm _0.si _0.tvd _0.tvx _0_1.del 
_0_Asserting_0.dvd _0_Asserting_0.dvm _0_CheapBastard_0.dvdd 
_0_CheapBastard_0.dvdm _0_Direct_0.doc _0_Direct_0.pay 
_0_Direct_0.pos _0_Direct_0.tim _0_Direct_0.tip _0_Lucene41_0.doc   
  _0_Lucene41_0.pay _0_Lucene41_0.pos _0_Lucene41_0.tim 
_0_Lucene41_0.tip _0_Lucene42_0.dvd _0_Lucene42_0.dvm 
_0_Memory_0.ram _0_SimpleText_0.dat _0_SimpleText_0.pst 
segments.gen segments_2

Stack Trace:
java.lang.AssertionError: incorrect filenames in index: expected:
_0.cfe
_0.cfs
_0.si
_0_1.del
segments.gen
segments_2
 or _0.cfe
_0.cfs
_0.si
_0_1.liv
segments.gen
segments_2
 actual:
_0.fdt
_0.fdx
_0.fnm
_0.nvd
_0.nvm
_0.si
_0.tvd
_0.tvx
_0_1.del
_0_Asserting_0.dvd
_0_Asserting_0.dvm
_0_CheapBastard_0.dvdd
_0_CheapBastard_0.dvdm
_0_Direct_0.doc
_0_Direct_0.pay
_0_Direct_0.pos
_0_Direct_0.tim
_0_Direct_0.tip
_0_Lucene41_0.doc
_0_Lucene41_0.pay
_0_Lucene41_0.pos
_0_Lucene41_0.tim
_0_Lucene41_0.tip
_0_Lucene42_0.dvd
_0_Lucene42_0.dvm
_0_Memory_0.ram
_0_SimpleText_0.dat
_0_SimpleText_0.pst
segments.gen
segments_2
at 
__randomizedtesting.SeedInfo.seed([B00E90C39659DDF1:41A53BC4769E4F38]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.lucene.index.TestBackwardsCompatibility3x.testExactFileNames(TestBackwardsCompatibility3x.java:656)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:491)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
c

[jira] [Updated] (LUCENE-5006) Simplify / understand IndexWriter/DocumentsWriter synchronization

2013-06-14 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-5006:


Attachment: LUCENE-5006.patch

Here is a first, rough cut with some nocommits, but it shows the idea. This 
patch removes all dependencies on DocumentsWriter and IndexWriter from DWPT. I 
factored everything out and made DWPT basically a non-reusable class: it 
initializes itself entirely in the ctor, including the segment name etc. 

At the same time DWPTThreadPool doesn't initialize DWPT anymore; it only handles 
ThreadStates and pooling, and the actual DWPT creation happens at the 
ThreadPool's consumer level. This makes the pool much simpler as well. 

DocumentsWriter doesn't communicate with the IW directly. It only creates 
certain events that are processed by the IW once the DW action is done. 
This works since all the events are essentially atomic operations that can be 
executed once we exit DW. For instance, when we flush a segment the DW doesn't 
push it up to the IW; once DW returns, the IW processes the event and pulls the 
newly flushed segment in. The same is true for pending merges etc.

This simplifies the locking a lot, since we basically can't deadlock 
anymore from DW or DWPT: we don't even have a reference to IW anymore. The 
only remaining thing is that when we create a new DWPT we need to call into IW 
to get a new segment name, but it's a start.
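
To illustrate the direction, a rough sketch (class and method names here are 
made up for the sketch, they are not the ones in the patch):

{code:java}
import java.io.IOException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch only: DW never calls IndexWriter directly, it just queues events;
// IW drains the queue once the DW operation has returned.
interface Event {
  void process(IndexWriterFacade writer) throws IOException;
}

// Stand-in for the few IndexWriter operations an event needs.
interface IndexWriterFacade {
  void publishFlushedSegment(Object flushedSegment) throws IOException;
}

class DocumentsWriterSketch {
  private final Queue<Event> events = new ConcurrentLinkedQueue<Event>();

  // Called from a DWPT flush: no sync on IW, we only enqueue.
  void onSegmentFlushed(final Object flushedSegment) {
    events.add(new Event() {
      public void process(IndexWriterFacade writer) throws IOException {
        writer.publishFlushedSegment(flushedSegment);
      }
    });
  }

  // Called by IndexWriter after the DW call returns.
  void processEvents(IndexWriterFacade writer) throws IOException {
    for (Event event = events.poll(); event != null; event = events.poll()) {
      event.process(writer);
    }
  }
}
{code}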

> Simplify / understand IndexWriter/DocumentsWriter synchronization
> -
>
> Key: LUCENE-5006
> URL: https://issues.apache.org/jira/browse/LUCENE-5006
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
> Attachments: LUCENE-5006.patch
>
>
> The concurrency in IW/DW/BD is terrifying: there are many locks involved, not 
> just intrinsic locks but IW also has fullFlushLock, commitLock, and there are 
> no clear rules about lock order to avoid deadlocks like LUCENE-5002.
> We have to somehow simplify this, and define the allowed concurrent behavior 
> eg when an app calls deleteAll while other threads are indexing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (LUCENE-5006) Simplify / understand IndexWriter/DocumentsWriter synchronization

2013-06-14 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer reassigned LUCENE-5006:
---

Assignee: Simon Willnauer

> Simplify / understand IndexWriter/DocumentsWriter synchronization
> -
>
> Key: LUCENE-5006
> URL: https://issues.apache.org/jira/browse/LUCENE-5006
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Michael McCandless
>Assignee: Simon Willnauer
>
> The concurrency in IW/DW/BD is terrifying: there are many locks involved, not 
> just intrinsic locks but IW also has fullFlushLock, commitLock, and there are 
> no clear rules about lock order to avoid deadlocks like LUCENE-5002.
> We have to somehow simplify this, and define the allowed concurrent behavior 
> eg when an app calls deleteAll while other threads are indexing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5029) factor out a generic 'TermState' for better sharing in FST-based term dict

2013-06-14 Thread Han Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683723#comment-13683723
 ] 

Han Jiang commented on LUCENE-5029:
---

{quote}
Also, at write time, it'd be nice if the PBF got a DataOutput +
LongsRef from the terms dict which it could write the term data to,
and a "matching" DataInput + LongsRef at read time. I'm not sure
these even need to reside in BlockTermState?
{quote}

Yes, we can move those byte[] and DataInput/DataOutput into TermState so that
the API will be simpler. But with this change, the concept of 'TermState' will
need more explanation & javadocs.
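
Something along these lines, as a rough sketch of the API shape (this is not an 
actual Lucene interface, just an illustration):

{code:java}
import java.io.IOException;
import org.apache.lucene.store.DataInput;
import org.apache.lucene.store.DataOutput;
import org.apache.lucene.util.LongsRef;

// Sketch only: the terms dict hands the postings format a DataOutput + LongsRef
// at write time and a matching DataInput + LongsRef at read time, so the
// byte[]/long[] buffering stays on the terms dict side.
interface TermMetaDataCodec {
  void writeTermMeta(DataOutput out, LongsRef longs) throws IOException;
  void readTermMeta(DataInput in, LongsRef longs) throws IOException;
}
{code}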

> factor out a generic 'TermState' for better sharing in FST-based term dict
> --
>
> Key: LUCENE-5029
> URL: https://issues.apache.org/jira/browse/LUCENE-5029
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Han Jiang
>Assignee: Han Jiang
>Priority: Minor
> Fix For: 4.4
>
> Attachments: LUCENE-5029.algebra.patch, LUCENE-5029.patch, 
> LUCENE-5029.patch, LUCENE-5029.patch
>
>
> Currently, those two FST-based term dict (memory codec & blocktree) all use 
> FST as a base data structure, this might not share much data in 
> parent arcs, since the encoded BytesRef doesn't guarantee that 
> 'Outputs.common()' always creates a long prefix. 
> While for current postings format, it is guaranteed that each FP (pointing to 
> .doc, .pos, etc.) will increase monotonically with 'larger' terms. That 
> means, between two Outputs, the Outputs from smaller term can be safely 
> pushed towards root. However we always have some tricky TermState to deal 
> with (like the singletonDocID for pulsing trick), so as Mike suggested, we 
> can simply cut the whole TermState into two parts: one part for comparison 
> and intersection, another for restoring generic data. Then the data structure 
> will be clear: this generic 'TermState' will consist of a fixed-length 
> LongsRef and variable-length BytesRef. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5029) factor out a generic 'TermState' for better sharing in FST-based term dict

2013-06-14 Thread Han Jiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Han Jiang updated LUCENE-5029:
--

Attachment: LUCENE-5029.patch
LUCENE-5029.algebra.patch

Just got rid of the hairy generalization :) 
Here I just copied BlockTreeTerms* + Posting*Base + Lucene41Postings* to 
create a temporary block-based codec, TempBlock, to iterate on 
the new design.

The failing test in the last patch comes from the reuse of TermState: 
we also have to deep-copy TermMetaData so that, with
multiple threads, the same TermMetaData won't be modified 
simultaneously. This is somewhat sad because the reuse then
creates new objects anyway, but we can leave that issue for later.

The current version of 'LUCENE-5029.patch' works on latest trunk,
but it is too long to review... so I created a subset in 
'LUCENE-5029.algebra.patch'; I think you can just review that one, Mike.

The ideas are the same as in my last comment: 
1. Put those algebra operations into MetaData, 
   so that the PF can customize them.
2. Move the readTermBlock & flushBlock & buffering stuff to the 
   term dict side, so that we have a cleaner PF and a pluggable PostingsBase.
To simplify the code, I haven't used long[] and byte[] here; 
I'll implement that read() in MetaData later.
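
For the deep-copy issue mentioned above, a rough sketch (names are made up, not 
taken from the patch):

{code:java}
import java.util.Arrays;

// Sketch only: when a TermState is reused across threads, its metadata has to
// be deep-copied so two threads never mutate the same underlying arrays.
class TermMetaDataSketch {
  long[] longs;  // fixed-length part, e.g. file pointers
  byte[] bytes;  // variable-length generic part

  TermMetaDataSketch deepCopy() {
    TermMetaDataSketch copy = new TermMetaDataSketch();
    copy.longs = Arrays.copyOf(longs, longs.length);
    copy.bytes = Arrays.copyOf(bytes, bytes.length);
    return copy;
  }
}
{code}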


> factor out a generic 'TermState' for better sharing in FST-based term dict
> --
>
> Key: LUCENE-5029
> URL: https://issues.apache.org/jira/browse/LUCENE-5029
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Han Jiang
>Assignee: Han Jiang
>Priority: Minor
> Fix For: 4.4
>
> Attachments: LUCENE-5029.algebra.patch, LUCENE-5029.patch, 
> LUCENE-5029.patch, LUCENE-5029.patch
>
>
> Currently, those two FST-based term dict (memory codec & blocktree) all use 
> FST as a base data structure, this might not share much data in 
> parent arcs, since the encoded BytesRef doesn't guarantee that 
> 'Outputs.common()' always creates a long prefix. 
> While for current postings format, it is guaranteed that each FP (pointing to 
> .doc, .pos, etc.) will increase monotonically with 'larger' terms. That 
> means, between two Outputs, the Outputs from smaller term can be safely 
> pushed towards root. However we always have some tricky TermState to deal 
> with (like the singletonDocID for pulsing trick), so as Mike suggested, we 
> can simply cut the whole TermState into two parts: one part for comparison 
> and intersection, another for restoring generic data. Then the data structure 
> will be clear: this generic 'TermState' will consist of a fixed-length 
> LongsRef and variable-length BytesRef. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_45) - Build # 6066 - Still Failing!

2013-06-14 Thread Simon Willnauer
Thanks David for taking a look. Can you maybe @Ignore the test until you
get it fixed, unless you are prioritising this? :)

simon


On Fri, Jun 14, 2013 at 9:47 PM, David Smiley (@MITRE.org) <
dsmi...@mitre.org> wrote:

> Darn; I haven't noticed these failures.  I'll investigate.  I need to set
> up
> some sort of email filter alert system so they come to my attention
> immediately.
>
> (without knowing what the bug is; it's most likely in the complicated
> test).
>
> ~ David
>
>
> Policeman Jenkins Server-2 wrote
> > Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6066/
> > Java: 32bit/jdk1.6.0_45 -server -XX:+UseSerialGC
> >
> > 1 tests failed.
> > FAILED:
> >
> org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains
> > {#1 seed=[9166D28D6532217A:472BE5C4B7344982]}
> >
> > Error Message:
> > Shouldn't match I
> > #0:ShapePair(Rect(minX=102.0,maxX=112.0,minY=-36.0,maxY=120.0) ,
> > Rect(minX=168.0,maxX=175.0,minY=-1.0,maxY=11.0))
> > Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)
> >
> > Stack Trace:
> > java.lang.AssertionError: Shouldn't match I
> > #0:ShapePair(Rect(minX=102.0,maxX=112.0,minY=-36.0,maxY=120.0) ,
> > Rect(minX=168.0,maxX=175.0,minY=-1.0,maxY=11.0))
> > Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)
> >   at
> > __randomizedtesting.SeedInfo.seed([9166D28D6532217A:472BE5C4B7344982]:0)
> >   at org.junit.Assert.fail(Assert.java:93)
> >   at
> >
> org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.fail(SpatialOpRecursivePrefixTreeTest.java:287)
> >   at
> >
> org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.doTest(SpatialOpRecursivePrefixTreeTest.java:273)
> >   at
> >
> org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains(SpatialOpRecursivePrefixTreeTest.java:101)
> >   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >   at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >   at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >   at java.lang.reflect.Method.invoke(Method.java:597)
> >   at
> >
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
> >   at
> >
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
> >   at
> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
> >   at
> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
> >   at
> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
> >   at
> >
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
> >   at
> >
> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
> >   at
> >
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> >   at
> >
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> >   at
> >
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
> >   at
> >
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> >   at
> >
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> >   at
> >
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> >   at
> >
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
> >   at
> >
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
> >   at
> >
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
> >   at
> >
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
> >   at
> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
> >   at
> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
> >   at
> >
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
> >   at
> >
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> >   at
> >
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> >   at
> >
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> >   at
> >
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> >   at
> >
> com.carrotsearch.randomizedtesting.ru

[jira] [Updated] (SOLR-4805) SolrCloud - RELOAD on collections or cores leaves collection offline and unusable till restart

2013-06-14 Thread Shawn Heisey (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Heisey updated SOLR-4805:
---

Summary: SolrCloud - RELOAD on collections or cores leaves collection 
offline and unusable till restart  (was: Calling Collection RELOAD where 
collection has a single core, leaves collection offline and unusable till 
reboot)

> SolrCloud - RELOAD on collections or cores leaves collection offline and 
> unusable till restart
> --
>
> Key: SOLR-4805
> URL: https://issues.apache.org/jira/browse/SOLR-4805
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Jared Rodriguez
>Assignee: Mark Miller
> Fix For: 5.0, 4.4
>
> Attachments: SOLR-4805.patch
>
>
> If you have a collection that is composed of a single core, then calling 
> reload on that collection leaves the core offline.  This happens even if 
> nothing at all has changed about the collection or its config, and it happens 
> whether you call reload via an HTTP GET or directly via the collections API.
> I tried a collection with a single core that contains data, changed nothing 
> about the config in ZK and called reload on the collection.  The call 
> completes, but ZK flags that replica with "state":"down".
> Trying it where the single core contains no data gives the same result: the 
> ZK config updates and broadcasts "state":"down" for the replica.
> I did not try this in a multicore or replicated core environment.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_45) - Build # 6066 - Still Failing!

2013-06-14 Thread David Smiley (@MITRE.org)
Darn; I haven't noticed these failures.  I'll investigate.  I need to set up
some sort of email filter alert system so they come to my attention
immediately.

(without knowing what the bug is; it's most likely in the complicated test).

~ David


Policeman Jenkins Server-2 wrote
> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6066/
> Java: 32bit/jdk1.6.0_45 -server -XX:+UseSerialGC
> 
> 1 tests failed.
> FAILED: 
> org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains
> {#1 seed=[9166D28D6532217A:472BE5C4B7344982]}
> 
> Error Message:
> Shouldn't match I
> #0:ShapePair(Rect(minX=102.0,maxX=112.0,minY=-36.0,maxY=120.0) ,
> Rect(minX=168.0,maxX=175.0,minY=-1.0,maxY=11.0))
> Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)
> 
> Stack Trace:
> java.lang.AssertionError: Shouldn't match I
> #0:ShapePair(Rect(minX=102.0,maxX=112.0,minY=-36.0,maxY=120.0) ,
> Rect(minX=168.0,maxX=175.0,minY=-1.0,maxY=11.0))
> Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)
>   at
> __randomizedtesting.SeedInfo.seed([9166D28D6532217A:472BE5C4B7344982]:0)
>   at org.junit.Assert.fail(Assert.java:93)
>   at
> org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.fail(SpatialOpRecursivePrefixTreeTest.java:287)
>   at
> org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.doTest(SpatialOpRecursivePrefixTreeTest.java:273)
>   at
> org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains(SpatialOpRecursivePrefixTreeTest.java:101)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
>   at
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
>   at
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
>   at
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
>   at
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
>   at
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>   at
> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
>   at
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>   at
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>   at
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>   at
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>   at
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>   at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
>   at
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
>   at
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
>   at
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
>   at
> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
>   at
> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
>   at
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
>   at
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>   at
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>   at
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>   at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>   at
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>   at
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>   at
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
>   at
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>   at
> org.apache.lucene.util.TestRuleIgnoreAfterMax

[jira] [Comment Edited] (SOLR-2305) DataImportScheduler

2013-06-14 Thread Smita Raval (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683700#comment-13683700
 ] 

Smita Raval edited comment on SOLR-2305 at 6/14/13 7:42 PM:


I would also vote for a DIH scheduler. 
It would be really nice to add an enable parameter to the DIH defaults, similar 
to the replicationHandler. 
That way the same config can be kept for master and slaves, triggering 
dataimport only if the master is enabled.

  was (Author: sraval):
I would also vote for DataImport scheduler. 
It would be really nice to add enable parameter to DIH defaults, similar to 
replicationHandler. 
So, can keep the same config for master and slaves and triggering dataimport 
only if master is enabled.
  
> DataImportScheduler
> ---
>
> Key: SOLR-2305
> URL: https://issues.apache.org/jira/browse/SOLR-2305
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0-ALPHA
>Reporter: Bill Bell
> Fix For: 4.4
>
> Attachments: patch.txt, SOLR-2305-1.diff
>
>
> Marko Bonaci has updated the WIKI page to add the DataImportScheduler, but I 
> cannot find a JIRA ticket for it?
> http://wiki.apache.org/solr/DataImportHandler
> Do we have a ticket so the code can be tracked?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2305) DataImportScheduler

2013-06-14 Thread Smita Raval (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683700#comment-13683700
 ] 

Smita Raval commented on SOLR-2305:
---

I would also vote for DataImport scheduler. 
It would be really nice to add enable parameter to DIH defaults, similar to 
replicationHandler. 
So, can keep the same config for master and slaves and triggering dataimport 
only if master is enabled.

> DataImportScheduler
> ---
>
> Key: SOLR-2305
> URL: https://issues.apache.org/jira/browse/SOLR-2305
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 4.0-ALPHA
>Reporter: Bill Bell
> Fix For: 4.4
>
> Attachments: patch.txt, SOLR-2305-1.diff
>
>
> Marko Bonaci has updated the WIKI page to add the DataImportScheduler, but I 
> cannot find a JIRA ticket for it?
> http://wiki.apache.org/solr/DataImportHandler
> Do we have a ticket so the code can be tracked?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-5038) Don't call MergePolicy / IndexWriter during DWPT Flush

2013-06-14 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-5038.
-

   Resolution: Fixed
Lucene Fields: New,Patch Available  (was: New)

> Don't call MergePolicy / IndexWriter during DWPT Flush
> --
>
> Key: LUCENE-5038
> URL: https://issues.apache.org/jira/browse/LUCENE-5038
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Affects Versions: 5.0, 4.3
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 5.0, 4.4
>
> Attachments: LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch, 
> LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch
>
>
> We currently consult the indexwriter -> merge policy to decide if we need to 
> write CFS or not which is bad in many ways.
> - we should call mergepolicy only during merges
> - we should never sync on IW during DWPT flush
> - we should be able to make the decision if we need to write CFS or not 
> before flush, ie. we could write parts of the flush directly to CFS or even 
> start writing stored fields directly.
> - in the NRT case it might make sense to write all flushes to CFS to minimize 
> filedescriptors independent of the index size.
> I wonder if we can use a simple boolean for this in the IWC and get away with 
> not consulting merge policy. This would simplify concurrency a lot here 
> already.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_45) - Build # 6066 - Still Failing!

2013-06-14 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6066/
Java: 32bit/jdk1.6.0_45 -server -XX:+UseSerialGC

1 tests failed.
FAILED:  
org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains 
{#1 seed=[9166D28D6532217A:472BE5C4B7344982]}

Error Message:
Shouldn't match I 
#0:ShapePair(Rect(minX=102.0,maxX=112.0,minY=-36.0,maxY=120.0) , 
Rect(minX=168.0,maxX=175.0,minY=-1.0,maxY=11.0)) 
Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)

Stack Trace:
java.lang.AssertionError: Shouldn't match I 
#0:ShapePair(Rect(minX=102.0,maxX=112.0,minY=-36.0,maxY=120.0) , 
Rect(minX=168.0,maxX=175.0,minY=-1.0,maxY=11.0)) 
Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)
at 
__randomizedtesting.SeedInfo.seed([9166D28D6532217A:472BE5C4B7344982]:0)
at org.junit.Assert.fail(Assert.java:93)
at 
org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.fail(SpatialOpRecursivePrefixTreeTest.java:287)
at 
org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.doTest(SpatialOpRecursivePrefixTreeTest.java:273)
at 
org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains(SpatialOpRecursivePrefixTreeTest.java:101)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementR

[jira] [Updated] (LUCENE-5038) Don't call MergePolicy / IndexWriter during DWPT Flush

2013-06-14 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-5038:


Affects Version/s: 5.0
Fix Version/s: 5.0

> Don't call MergePolicy / IndexWriter during DWPT Flush
> --
>
> Key: LUCENE-5038
> URL: https://issues.apache.org/jira/browse/LUCENE-5038
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Affects Versions: 5.0, 4.3
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 5.0, 4.4
>
> Attachments: LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch, 
> LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch
>
>
> We currently consult the indexwriter -> merge policy to decide if we need to 
> write CFS or not which is bad in many ways.
> - we should call mergepolicy only during merges
> - we should never sync on IW during DWPT flush
> - we should be able to make the decision if we need to write CFS or not 
> before flush, ie. we could write parts of the flush directly to CFS or even 
> start writing stored fields directly.
> - in the NRT case it might make sense to write all flushes to CFS to minimize 
> filedescriptors independent of the index size.
> I wonder if we can use a simple boolean for this in the IWC and get away with 
> not consulting merge policy. This would simplify concurrency a lot here 
> already.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5038) Don't call MergePolicy / IndexWriter during DWPT Flush

2013-06-14 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683694#comment-13683694
 ] 

Commit Tag Bot commented on LUCENE-5038:


[branch_4x commit] simonw
http://svn.apache.org/viewvc?view=revision&revision=1493225

LUCENE-5038: Refactor CompoundFile settings in MergePolicy and IndexWriterConfig

> Don't call MergePolicy / IndexWriter during DWPT Flush
> --
>
> Key: LUCENE-5038
> URL: https://issues.apache.org/jira/browse/LUCENE-5038
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Affects Versions: 4.3
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 4.4
>
> Attachments: LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch, 
> LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch
>
>
> We currently consult the indexwriter -> merge policy to decide if we need to 
> write CFS or not which is bad in many ways.
> - we should call mergepolicy only during merges
> - we should never sync on IW during DWPT flush
> - we should be able to make the decision if we need to write CFS or not 
> before flush, ie. we could write parts of the flush directly to CFS or even 
> start writing stored fields directly.
> - in the NRT case it might make sense to write all flushes to CFS to minimize 
> filedescriptors independent of the index size.
> I wonder if we can use a simple boolean for this in the IWC and get away with 
> not consulting merge policy. This would simplify concurrency a lot here 
> already.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.7.0) - Build # 546 - Failure!

2013-06-14 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/546/
Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
REGRESSION:  org.apache.solr.client.solrj.TestBatchUpdate.testWithBinary

Error Message:
IOException occured when talking to server at: 
https://127.0.0.1:53571/solr/collection1

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: https://127.0.0.1:53571/solr/collection1
at 
__randomizedtesting.SeedInfo.seed([A8776DEF499B44B7:FA0123F0CE749EA4]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:435)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:168)
at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:146)
at 
org.apache.solr.client.solrj.TestBatchUpdate.doIt(TestBatchUpdate.java:130)
at 
org.apache.solr.client.solrj.TestBatchUpdate.testWithBinary(TestBatchUpdate.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnor

[jira] [Commented] (SOLR-4403) The CoreAdminRequest.Create class does not support replicationFactor, maxShardsPerNode parameters

2013-06-14 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683650#comment-13683650
 ] 

Shawn Heisey commented on SOLR-4403:


The replicationFactor and maxShardsPerNode parameters are for collections, not 
cores.  They have no meaning in the context of creating cores.

As far as I can tell, there is not a CollectionsAdminRequest class at this time.
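
Until such a class exists, the request has to be built by hand against the 
Collections API, roughly like this (the collection name and numbers below are 
only placeholders):

{code:java}
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.CollectionParams.CollectionAction;
import org.apache.solr.common.params.ModifiableSolrParams;

// Sketch: hand-built CREATE call to /admin/collections.
public class CreateCollectionSketch {
  public static void main(String[] args) throws Exception {
    SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("action", CollectionAction.CREATE.toString());
    params.set("name", "mycollection");      // placeholder collection name
    params.set("numShards", 2);
    params.set("replicationFactor", 2);
    params.set("maxShardsPerNode", 2);
    QueryRequest request = new QueryRequest(params);
    request.setPath("/admin/collections");
    server.request(request);
    server.shutdown();
  }
}
{code}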


> The CoreAdminRequest.Create class does not support replicationFactor, 
> maxShardsPerNode parameters
> -
>
> Key: SOLR-4403
> URL: https://issues.apache.org/jira/browse/SOLR-4403
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Affects Versions: 4.1
>Reporter: Colin Bartolome
>Priority: Minor
> Fix For: 4.4
>
>
> The CoreAdminRequest.Create class does not support the replicationFactor or 
> maxShardsPerNode parameters, forcing me to build up a set of parameters and a 
> SolrRequest object by hand. (There may be other parameters that are also not 
> supported by that class and the other classes within CoreAdminRequest.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5056) Indexing non-point shapes close to the poles doesn't scale

2013-06-14 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683606#comment-13683606
 ] 

David Smiley commented on LUCENE-5056:
--

The WKT spec says counter-clockwise order for the outer shell, and Spatial4j 
demands that for rectangles expressed as polygons.  A lot of software 
(OpenLayers, JTS, PostGIS) doesn't care and lets you do it however you want, 
even though technically the shape is ambiguous (which part of the ring is the 
inside vs the outside?).  This is in the FAQ on Solr's wiki.  In the next 
version of Spatial4j I'll make it support both.
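
As a concrete illustration (coordinates chosen arbitrarily), this is the 
counter-clockwise form of a small rectangle, which Spatial4j accepts:

POLYGON((10 10, 20 10, 20 20, 10 20, 10 10))

and the same coordinates listed clockwise, which lenient software treats as the 
same rectangle but which, under the counter-clockwise convention, describes the 
rest of the globe instead (the huge shapes mentioned in the description):

POLYGON((10 10, 10 20, 20 20, 20 10, 10 10))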

> Indexing non-point shapes close to the poles doesn't scale
> --
>
> Key: LUCENE-5056
> URL: https://issues.apache.org/jira/browse/LUCENE-5056
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial
>Affects Versions: 4.3
>Reporter: Hal Deadman
> Attachments: indexed circle close to the pole.png
>
>
> From: [~hdeadman]
> We are seeing an issue where certain shapes are causing Solr to use up all 
> available heap space when a record with one of those shapes is indexed. We 
> were indexing polygons where we had the points going clockwise instead of 
> counter-clockwise and the shape would be so large that we would run out of 
> memory. We fixed those shapes but we are seeing this circle eat up about 
> 700MB of memory before we get an OutOfMemory error (heap space) with a 1GB 
> JVM heap.
> Circle(3.0 90 d=0.0499542757922153)
> Google Earth can't plot that circle either, maybe it is invalid or too close 
> to the north pole due to the latitude of 90, but it would be nice if there 
> was a way for shapes to be validated before they cause an OOM error.
> The objects (4.5 million) are all GeohashPrefixTree$GhCell objects in an 
> ArrayList owned by PrefixTreeStrategy$CellTokenStream.
> Is there any way to have a max number of cells in a shape before it is 
> considered too large and is not indexed? Is there a geo library that could 
> validate the shape as being reasonably sized and bounded before it is 
> processed?
> We are currently using Solr 4.1.
> <fieldType ... class="solr.SpatialRecursivePrefixTreeFieldType"
> spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
> geo="true" distErrPct="0.025" maxDistErr="0.09" units="degrees" />

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (LUCENE-4740) Weak references cause extreme GC churn

2013-06-14 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-4740:
-

Assignee: Uwe Schindler

> Weak references cause extreme GC churn
> --
>
> Key: LUCENE-4740
> URL: https://issues.apache.org/jira/browse/LUCENE-4740
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/store
>Affects Versions: 3.6.1
> Environment: Linux debian squeeze 64 bit, Oracle JDK 6, 32 GB RAM, 16 
> cores
> -Xmx16G
>Reporter: Kristofer Karlsson
>Assignee: Uwe Schindler
> Fix For: 5.0, 4.4
>
> Attachments: LUCENE-4740.patch, LUCENE-4740.patch
>
>
> We are running a set of independent search machines, running our custom 
> software using lucene as a search library. We recently upgraded from lucene 
> 3.0.3 to 3.6.1 and noticed a severe degradation of performance.
> After doing some heap dump digging, it turns out the process is stalling 
> because it's spending so much time in GC. We noticed about 212 million 
> WeakReference instances, originating from WeakIdentityMap, originating from 
> MMapIndexInput.
> Our problem completely went away after removing the clones WeakHashMap from 
> MMapIndexInput, and, as a side effect, disabling support for explicitly 
> unmapping the mmapped data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-4740) Weak references cause extreme GC churn

2013-06-14 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-4740.
---

   Resolution: Fixed
Fix Version/s: 5.0

> Weak references cause extreme GC churn
> --
>
> Key: LUCENE-4740
> URL: https://issues.apache.org/jira/browse/LUCENE-4740
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/store
>Affects Versions: 3.6.1
> Environment: Linux debian squeeze 64 bit, Oracle JDK 6, 32 GB RAM, 16 
> cores
> -Xmx16G
>Reporter: Kristofer Karlsson
> Fix For: 5.0, 4.4
>
> Attachments: LUCENE-4740.patch, LUCENE-4740.patch
>
>
> We are running a set of independent search machines, running our custom 
> software using lucene as a search library. We recently upgraded from lucene 
> 3.0.3 to 3.6.1 and noticed a severe degradation of performance.
> After doing some heap dump digging, it turns out the process is stalling 
> because it's spending so much time in GC. We noticed about 212 million 
> WeakReference instances, originating from WeakIdentityMap, originating from 
> MMapIndexInput.
> Our problem completely went away after removing the clones WeakHashMap from 
> MMapIndexInput, and, as a side effect, disabling support for explicitly 
> unmapping the mmapped data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5056) Indexing non-point shapes close to the poles doesn't scale

2013-06-14 Thread Hal Deadman (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683596#comment-13683596
 ] 

Hal Deadman commented on LUCENE-5056:
-

This might not be the same issue, but we have a small rectangle that uses a 
really large amount of memory:

POLYGON((1.025 1.025, 1.025 1.101, 1.101 1.101, 1.101 1.025, 1.025 1.025))

If we change it just a little we don't get out of memory errors:

POLYGON((1.025001 1.025, 1.025 1.101, 1.101 1.101, 1.101 1.025, 1.025001 1.025))


> Indexing non-point shapes close to the poles doesn't scale
> --
>
> Key: LUCENE-5056
> URL: https://issues.apache.org/jira/browse/LUCENE-5056
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial
>Affects Versions: 4.3
>Reporter: Hal Deadman
> Attachments: indexed circle close to the pole.png
>
>
> From: [~hdeadman]
> We are seeing an issue where certain shapes are causing Solr to use up all 
> available heap space when a record with one of those shapes is indexed. We 
> were indexing polygons where we had the points going clockwise instead of 
> counter-clockwise and the shape would be so large that we would run out of 
> memory. We fixed those shapes but we are seeing this circle eat up about 
> 700MB of memory before we get an OutOfMemory error (heap space) with a 1GB 
> JVM heap.
> Circle(3.0 90 d=0.0499542757922153)
> Google Earth can't plot that circle either, maybe it is invalid or too close 
> to the north pole due to the latitude of 90, but it would be nice if there 
> was a way for shapes to be validated before they cause an OOM error.
> The objects (4.5 million) are all GeohashPrefixTree$GhCell objects in an 
> ArrayList owned by PrefixTreeStrategy$CellTokenStream.
> Is there any way to have a max number of cells in a shape before it is 
> considered too large and is not indexed? Is there a geo library that could 
> validate the shape as being reasonably sized and bounded before it is 
> processed?
> We are currently using Solr 4.1.
> <fieldType ... class="solr.SpatialRecursivePrefixTreeFieldType"
> spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
> geo="true" distErrPct="0.025" maxDistErr="0.09" units="degrees" />

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4740) Weak references cause extreme GC churn

2013-06-14 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683594#comment-13683594
 ] 

Uwe Schindler commented on LUCENE-4740:
---

bq. Therefore I have a simple question. If I run Solr branch_4x with this patch 
applied, will I benefit? I can see from the commit log that unmmapping must 
disabled to benefit, but I don't know if this is how Solr operates.

If you have a version with this patch applied and you are using an index that 
does not change too often, it is better not to unmap. In that case you can pass 
unmap="false" as a parameter to your MMapDirectoryFactory in Solr. Because the 
index does not change too often, the overhead of unmapping the files at a later 
stage does not matter, and the GC has less to do then.
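
For plain Lucene users the equivalent switch sits on the directory itself, 
e.g. (minimal sketch, index path is a placeholder):

{code:java}
import java.io.File;
import java.io.IOException;
import org.apache.lucene.store.MMapDirectory;

// Sketch: disable explicit unmapping; cleanup is then left to GC instead.
public class DisableUnmapSketch {
  public static void main(String[] args) throws IOException {
    MMapDirectory dir = new MMapDirectory(new File("/path/to/index"));
    dir.setUseUnmap(false);
  }
}
{code}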

> Weak references cause extreme GC churn
> --
>
> Key: LUCENE-4740
> URL: https://issues.apache.org/jira/browse/LUCENE-4740
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/store
>Affects Versions: 3.6.1
> Environment: Linux debian squeeze 64 bit, Oracle JDK 6, 32 GB RAM, 16 
> cores
> -Xmx16G
>Reporter: Kristofer Karlsson
> Fix For: 4.4
>
> Attachments: LUCENE-4740.patch, LUCENE-4740.patch
>
>
> We are running a set of independent search machines, running our custom 
> software using lucene as a search library. We recently upgraded from lucene 
> 3.0.3 to 3.6.1 and noticed a severe degradation of performance.
> After doing some heap dump digging, it turns out the process is stalling 
> because it's spending so much time in GC. We noticed about 212 million 
> WeakReference, originating from WeakIdentityMap, originating from 
> MMapIndexInput.
> Our problem completely went away after removing the clones weakhashmap from 
> MMapIndexInput, and as a side-effect, disabling support for explictly 
> unmapping the mmapped data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4927) search pattern issues

2013-06-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-4927.
--

Resolution: Invalid

Please discuss this question on the user's list to ascertain whether it's a 
real bug or something in your configuration before raising a JIRA. You can 
re-open this if it's determined that this is really a problem in Solr.

> search pattern issues
> -
>
> Key: SOLR-4927
> URL: https://issues.apache.org/jira/browse/SOLR-4927
> Project: Solr
>  Issue Type: Task
>  Components: search
>Affects Versions: 3.5
> Environment: linux,production
>Reporter: raghu
>Priority: Critical
>
> We noticed that when searching with cn6 on our site, no results were being 
> shown.
> Noticed the cn6 pattern is being changed to cnc; below are the related search 
> queries.
> Can you please let us know how we can make this work?
> Input:
> q=(crossReference1_string:(cn6*^30.0 OR cn6^60.0)) OR 
> (crossReference2_string:(cn6*^30.0 OR cn6^60.0)) OR 
> (crossReference3_string:(cn6*^30.0 OR cn6^60.0)) OR 
> (crossReference4_string:(cn6*^30.0 OR cn6^60.0)) OR 
> (crossReference5_string:(cn6*^30.0 OR cn6^60.0)) OR (code_string:(cn6^20.0 OR 
> cn6*^10.0)) OR (name_text_en:(cn6*^60.0 OR cn6^120.0)) OR 
> (productcategoryName_text_en_mv:(cn6^20.0 OR cn6*^10.0)) OR 
> (description_text_en:(cn6^100.0 OR cn6*^50.0)) OR (mpn_text:(cn6*^60.0 OR 
> cn6^120.0)) OR (milSpecNumber_string:(cn6^80.0 OR cn6*^40.0)) OR 
> (supplierName_text:(cn6*^20.0 OR cn6^40.0)) OR 
> (supplierCode_string:(cn6*^80.0 OR 
> cn6^160.0))&spellcheck=true&spellcheck.dictionary=en&spellcheck.collate=true&spellcheck.q=cn6&fq=(catalogId:"acalProductCatalogUK")&fq=(catalogVersion:Online)&start=0&rows=20&facet=true&facet.mincount=1&facet.limit=800&facet.sort=count&facet.field=category_string_mv&sort=lifecycleIndicator_int
>  desc,onlineDate_date desc
> Output:
> q=(crossReference1_string:(cnc^60.0 OR cnc*^30.0)) OR 
> (crossReference2_string:(cnc^60.0 OR cnc*^30.0)) OR 
> (crossReference3_string:(cnc^60.0 OR cnc*^30.0)) OR 
> (crossReference4_string:(cnc^60.0 OR cnc*^30.0)) OR 
> (crossReference5_string:(cnc^60.0 OR cnc*^30.0)) OR (code_string:(cnc^20.0 OR 
> cnc*^10.0)) OR (name_text_en:(cnc*^60.0 OR cnc^120.0)) OR 
> (productcategoryName_text_en_mv:(cnc^20.0 OR cnc*^10.0)) OR 
> (description_text_en:(cnc^100.0 OR cnc*^50.0)) OR (mpn_text:(cnc*^60.0 OR 
> cnc^120.0)) OR (milSpecNumber_string:(cnc*^40.0 OR cnc^80.0)) OR 
> (supplierName_text:(cnc*^20.0 OR cnc^40.0)) OR 
> (supplierCode_string:(cnc*^80.0 OR 
> cnc^160.0))&spellcheck=true&spellcheck.dictionary=en&spellcheck.collate=true&spellcheck.q=cnc&fq=(catalogId:"acalProductCatalogUK")&fq=(catalogVersion:Online)&start=0&rows=20&facet=true&facet.mincount=1&facet.limit=800&facet.sort=count&facet.field=category_string_mv&sort=lifecycleIndicator_int
>  desc,onlineDate_date desc
> Regards,
> Raghu.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Christian (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683499#comment-13683499
 ] 

Christian commented on LUCENE-5058:
---

Okay, thanks!

> IOException when trying to delete data from index
> -
>
> Key: LUCENE-5058
> URL: https://issues.apache.org/jira/browse/LUCENE-5058
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.3
> Environment: Windows 7
>Reporter: Christian
>
> I have a small test prog which inserts some data in an index and after that, 
> opens a searcher from the uncommitted index to search on and output the 
> result to std.out. The Searcher is immediately closed. Then, i call 
> deleteAll() on the index but it encounters an IOException stating that the 
> index files could not be removed. I have investigated with Sysinternals and 
> it says the file is still locked despite the fact that the index searcher is 
> correctly closed. If i call deleteAll() without opening a searcher before it 
> just works fine as expected. This seems to be a bug in Lucene, since closing 
> the index searcher makes it impossible to delete the index.
> Here is the source code:
> {code:title=Bar.java|borderStyle=solid}
> public class LuceneTest {
> private Directory dir;
> private IndexWriter writer;
> 
> public void addDocs(long value) throws IOException {
> Document doc = new Document();
> doc.add(new LongField("ID", value, Field.Store.YES));
> writer.deleteDocuments(new Term("ID", "1"));
> writer.addDocument(doc);
> }
> 
> public void search() throws IOException {
>   IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
> false));
> TopDocs results = 
> searcher.search(NumericRangeQuery.newLongRange("ID", 1L, 2L, true, true), 1);
> 
> for ( ScoreDoc sc : results.scoreDocs) {
> System.out.println(searcher.doc(sc.doc));
> }
> 
> searcher.getIndexReader().close();
> }
> public static void main(String[] args) throws IOException {
> new LuceneTest();
> }
> 
> public LuceneTest() throws IOException {
> dir = FSDirectory.open(new File("test"));
> IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, 
> new StandardAnalyzer(Version.LUCENE_43));
> config.setInfoStream(System.out);
> writer = new IndexWriter(dir, config);
> 
> addDocs(1L); 
> search();
> //writer.commit(); -- If i call commit after search, then no 
> IOException occurrs!
> 
> writer.deleteAll();
> }
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Christian (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian resolved LUCENE-5058.
---

Resolution: Not A Problem

No error; expected behaviour on Windows.

> IOException when trying to delete data from index
> -
>
> Key: LUCENE-5058
> URL: https://issues.apache.org/jira/browse/LUCENE-5058
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.3
> Environment: Windows 7
>Reporter: Christian
>
> I have a small test prog which inserts some data in an index and after that, 
> opens a searcher from the uncommitted index to search on and output the 
> result to std.out. The Searcher is immediately closed. Then, i call 
> deleteAll() on the index but it encounters an IOException stating that the 
> index files could not be removed. I have investigated with Sysinternals and 
> it says the file is still locked despite the fact that the index searcher is 
> correctly closed. If i call deleteAll() without opening a searcher before it 
> just works fine as expected. This seems to be a bug in Lucene, since closing 
> the index searcher makes it impossible to delete the index.
> Here is the source code:
> {code:title=Bar.java|borderStyle=solid}
> public class LuceneTest {
> private Directory dir;
> private IndexWriter writer;
> 
> public void addDocs(long value) throws IOException {
> Document doc = new Document();
> doc.add(new LongField("ID", value, Field.Store.YES));
> writer.deleteDocuments(new Term("ID", "1"));
> writer.addDocument(doc);
> }
> 
> public void search() throws IOException {
>   IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
> false));
> TopDocs results = 
> searcher.search(NumericRangeQuery.newLongRange("ID", 1L, 2L, true, true), 1);
> 
> for ( ScoreDoc sc : results.scoreDocs) {
> System.out.println(searcher.doc(sc.doc));
> }
> 
> searcher.getIndexReader().close();
> }
> public static void main(String[] args) throws IOException {
> new LuceneTest();
> }
> 
> public LuceneTest() throws IOException {
> dir = FSDirectory.open(new File("test"));
> IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, 
> new StandardAnalyzer(Version.LUCENE_43));
> config.setInfoStream(System.out);
> writer = new IndexWriter(dir, config);
> 
> addDocs(1L); 
> search();
> //writer.commit(); -- If i call commit after search, then no 
> IOException occurrs!
> 
> writer.deleteAll();
> }
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683469#comment-13683469
 ] 

Michael McCandless commented on LUCENE-5058:


This happens because when you first open an NRT reader from the IndexWriter, it 
pools open SegmentReaders for all segments, even after you've closed your 
searcher.

Once you close the IndexWriter the files should be deleted.
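
A minimal sketch of that lifecycle (an assumed setup, not the reporter's exact 
program), just to show where the files become deletable:

{code:java}
import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class NrtPoolingSketch {
  public static void main(String[] args) throws Exception {
    Directory dir = FSDirectory.open(new File("test"));
    IndexWriter writer = new IndexWriter(dir,
        new IndexWriterConfig(Version.LUCENE_43, new StandardAnalyzer(Version.LUCENE_43)));
    writer.addDocument(new Document());

    // Opening an NRT reader makes the writer pool SegmentReaders internally.
    DirectoryReader reader = DirectoryReader.open(writer, false);
    reader.close(); // closes this reader, but the pooled SegmentReaders stay open

    writer.deleteAll(); // on Windows the old files may still be pinned at this point
    writer.close();     // releases the pooled readers; the files can now be deleted
    dir.close();
  }
}
{code}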

> IOException when trying to delete data from index
> -
>
> Key: LUCENE-5058
> URL: https://issues.apache.org/jira/browse/LUCENE-5058
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.3
> Environment: Windows 7
>Reporter: Christian
>
> I have a small test prog which inserts some data in an index and after that, 
> opens a searcher from the uncommitted index to search on and output the 
> result to std.out. The Searcher is immediately closed. Then, i call 
> deleteAll() on the index but it encounters an IOException stating that the 
> index files could not be removed. I have investigated with Sysinternals and 
> it says the file is still locked despite the fact that the index searcher is 
> correctly closed. If i call deleteAll() without opening a searcher before it 
> just works fine as expected. This seems to be a bug in Lucene, since closing 
> the index searcher makes it impossible to delete the index.
> Here is the source code:
> {code:title=Bar.java|borderStyle=solid}
> public class LuceneTest {
> private Directory dir;
> private IndexWriter writer;
> 
> public void addDocs(long value) throws IOException {
> Document doc = new Document();
> doc.add(new LongField("ID", value, Field.Store.YES));
> writer.deleteDocuments(new Term("ID", "1"));
> writer.addDocument(doc);
> }
> 
> public void search() throws IOException {
>   IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
> false));
> TopDocs results = 
> searcher.search(NumericRangeQuery.newLongRange("ID", 1L, 2L, true, true), 1);
> 
> for ( ScoreDoc sc : results.scoreDocs) {
> System.out.println(searcher.doc(sc.doc));
> }
> 
> searcher.getIndexReader().close();
> }
> public static void main(String[] args) throws IOException {
> new LuceneTest();
> }
> 
> public LuceneTest() throws IOException {
> dir = FSDirectory.open(new File("test"));
> IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, 
> new StandardAnalyzer(Version.LUCENE_43));
> config.setInfoStream(System.out);
> writer = new IndexWriter(dir, config);
> 
> addDocs(1L); 
> search();
> //writer.commit(); -- If i call commit after search, then no 
> IOException occurrs!
> 
> writer.deleteAll();
> }
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: LUCENE-2145, How to proceed on Tokenizer.close()

2013-06-14 Thread Robert Muir
I still think we can fix this without changing APIs: something like just
having the tokenizer close the reader in end() instead of close().

Then remove close() from the consumer lifecycle (it's only called e.g. when
analyzer.close() releases its CloseableThreadLocals of reused streams).

Index: lucene/core/src/java/org/apache/lucene/analysis/Tokenizer.java
===
--- lucene/core/src/java/org/apache/lucene/analysis/Tokenizer.java
(revision 1491309)
+++ lucene/core/src/java/org/apache/lucene/analysis/Tokenizer.java
(working copy)
@@ -47,6 +47,10 @@
 this.input = input;
   }

+  @Override
+  public void close() throws IOException {
+  }
+
   /**
* {@inheritDoc}
* 
@@ -55,7 +59,8 @@
* be sure to call super.close() when overriding this
method.
*/
   @Override
-  public void close() throws IOException {
+  public void end() throws IOException {
+super.end();
 if (input != null) {
   input.close();
   // LUCENE-2387: don't hold onto Reader after close, so


On Fri, Jun 14, 2013 at 8:13 AM, Benson Margulies wrote:

> Since LUCENE-2145 is of specific practical interest to me, I'd like to
> pitch in.
>
> However, it raises a bit of a compatibility thicket.
>
> There are, I think, two rough approaches:
>
> 1) redefine close() on TokenStream to have the less surprising
> semantics of 'don't use this object after it has been closed', and
> change existing callers to close the reader for themselves, or provide
> a closeReader().
>
> 2) leave close() as is, and define, well, 'reallyCloseIMeanIt()'.
>
> I'm happy to do grunt work here, but I'm obviously not the person to
> decide which of these is appropriate and tasteful.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] [Commented] (SOLR-4872) Allow schema analysis object factories to be cleaned up properly when the core shuts down

2013-06-14 Thread Benson Margulies (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683443#comment-13683443
 ] 

Benson Margulies commented on SOLR-4872:


Originally, Hoss documented some disquiet with using SolrCoreAware. If I add 
you two together and divide, I come up with the following proposal ...

# what you can see in the patch so far; a mechanism to get lifecycle awareness 
into the schema object.
# instead of a close method on the factories, allow them to implement a new 
interface, SchemaComponentLifecycle. This would achieve Hoss' goal of avoiding 
tangling schema and core. Then IndexSchema would invoke via this interface upon 
schema teardown.

Ultimately, I have to navigate amongst you all, so I will await further 
exchange of views.
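
To make option 2 concrete, here is a rough sketch of the proposed interface; 
it is purely hypothetical, the name and shape are placeholders, and nothing 
like it exists in Solr today:

{code:java}
// Hypothetical: analysis factories that need cleanup would implement this,
// and IndexSchema would call close() on every implementing component when
// the schema itself is torn down.
public interface SchemaComponentLifecycle {
  void close();
}
{code}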


> Allow schema analysis object factories to be cleaned up properly when the 
> core shuts down
> -
>
> Key: SOLR-4872
> URL: https://issues.apache.org/jira/browse/SOLR-4872
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.3
>Reporter: Benson Margulies
> Attachments: solr-4872.patch, solr-4872.patch
>
>
> I have a need, in an TokenizerFactory or TokenFilterFactory, to have a shared 
> cache that is cleaned up when the core is torn down. 
> There is no 'close' protocol on these things, and Solr rejects analysis 
> components that are SolrCoreAware. 
> Possible solutions:
> # add a close protocol to these factories and make sure it gets called at 
> core shutdown.
> # allow these items to be 'core-aware'.
> # invent some notion of 'schema-lifecycle-aware'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4928) Configure UnInvertedField to skip terms with too high or too low document frequency

2013-06-14 Thread David Smiley (JIRA)
David Smiley created SOLR-4928:
--

 Summary: Configure UnInvertedField to skip terms with too high or 
too low document frequency
 Key: SOLR-4928
 URL: https://issues.apache.org/jira/browse/SOLR-4928
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other
Reporter: David Smiley


I want to facet on my tokenized text for tag clouds and for analytical 
purposes.  Even though I only have 312k docs, UnInvertedField hit a limit -- 
"Too many values for UnInvertedField faceting on field text".  I guess some of 
these docs are bigger than I thought and have lots of distinct terms; I dunno.  

I'd like to add a new parameter named something like facet.uif.cache.minDf  
(named similarly to the existing facet.enum.cache.minDf).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Created] (SOLR-4927) search pattern issues

2013-06-14 Thread Jack Krupansky
Please pursue this on the Solr user mailing list first, unless you have 
specific reasons to believe there is a bug in Solr; I would note that your 
description offers no indication of an actual problem with Solr itself.


-- Jack Krupansky

-Original Message- 
From: raghu (JIRA)

Sent: Friday, June 14, 2013 9:43 AM
To: dev@lucene.apache.org
Subject: [jira] [Created] (SOLR-4927) search pattern issues

raghu created SOLR-4927:
---

Summary: search pattern issues
Key: SOLR-4927
URL: https://issues.apache.org/jira/browse/SOLR-4927
Project: Solr
 Issue Type: Task
 Components: search
   Affects Versions: 1.3
Environment: linux,production
   Reporter: raghu
   Priority: Critical



We noticed that when searching with cn6 on our site, no results were being 
shown.
Noticed the cn6 pattern is being changed to cnc; below are the related 
search queries.

Can you please let us know how we can make this work?


Input:

q=(crossReference1_string:(cn6*^30.0 OR cn6^60.0)) OR 
(crossReference2_string:(cn6*^30.0 OR cn6^60.0)) OR 
(crossReference3_string:(cn6*^30.0 OR cn6^60.0)) OR 
(crossReference4_string:(cn6*^30.0 OR cn6^60.0)) OR 
(crossReference5_string:(cn6*^30.0 OR cn6^60.0)) OR (code_string:(cn6^20.0 
OR cn6*^10.0)) OR (name_text_en:(cn6*^60.0 OR cn6^120.0)) OR 
(productcategoryName_text_en_mv:(cn6^20.0 OR cn6*^10.0)) OR 
(description_text_en:(cn6^100.0 OR cn6*^50.0)) OR (mpn_text:(cn6*^60.0 OR 
cn6^120.0)) OR (milSpecNumber_string:(cn6^80.0 OR cn6*^40.0)) OR 
(supplierName_text:(cn6*^20.0 OR cn6^40.0)) OR 
(supplierCode_string:(cn6*^80.0 OR 
cn6^160.0))&spellcheck=true&spellcheck.dictionary=en&spellcheck.collate=true&spellcheck.q=cn6&fq=(catalogId:"acalProductCatalogUK")&fq=(catalogVersion:Online)&start=0&rows=20&facet=true&facet.mincount=1&facet.limit=800&facet.sort=count&facet.field=category_string_mv&sort=lifecycleIndicator_int 
desc,onlineDate_date desc


Output:

q=(crossReference1_string:(cnc^60.0 OR cnc*^30.0)) OR 
(crossReference2_string:(cnc^60.0 OR cnc*^30.0)) OR 
(crossReference3_string:(cnc^60.0 OR cnc*^30.0)) OR 
(crossReference4_string:(cnc^60.0 OR cnc*^30.0)) OR 
(crossReference5_string:(cnc^60.0 OR cnc*^30.0)) OR (code_string:(cnc^20.0 
OR cnc*^10.0)) OR (name_text_en:(cnc*^60.0 OR cnc^120.0)) OR 
(productcategoryName_text_en_mv:(cnc^20.0 OR cnc*^10.0)) OR 
(description_text_en:(cnc^100.0 OR cnc*^50.0)) OR (mpn_text:(cnc*^60.0 OR 
cnc^120.0)) OR (milSpecNumber_string:(cnc*^40.0 OR cnc^80.0)) OR 
(supplierName_text:(cnc*^20.0 OR cnc^40.0)) OR 
(supplierCode_string:(cnc*^80.0 OR 
cnc^160.0))&spellcheck=true&spellcheck.dictionary=en&spellcheck.collate=true&spellcheck.q=cnc&fq=(catalogId:"acalProductCatalogUK")&fq=(catalogVersion:Online)&start=0&rows=20&facet=true&facet.mincount=1&facet.limit=800&facet.sort=count&facet.field=category_string_mv&sort=lifecycleIndicator_int 
desc,onlineDate_date desc


Regards,
Raghu.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA 
administrators

For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org 



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4872) Allow schema analysis object factories to be cleaned up properly when the core shuts down

2013-06-14 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683402#comment-13683402
 ] 

Yonik Seeley commented on SOLR-4872:


bq. For individual objects of shorter lifetime, finalizers are more of a 
problem.

Right, but I'm not proposing adding any new finalizers in lucene/solr - I was 
proposing you solve your specific issue that way.

Here's the thing: guaranteeing that you call close() once and only once after 
all users are done with an object adds a lot of complexity and constrains the 
implementation.  The schema sharing code is likely to change in the future, 
especially as we move toward named schemas, and I imagine sharing would be the 
default (I can't imagine why it wouldn't be, at least).  Modifiable schemas 
(adding fields / field types) on the fly also complicate things, especially in 
the future if/when we are able to change an existing field type.  This would 
also constrain other optimizations we might do, such as sharing common 
analysis components between different field types.  So the more I think about 
it, the more it seems like a bad idea to have close() on FieldType and friends.

It seems fine to allow SolrCoreAware, and I could also see value in adding a 
CoreContainer.addShutdownHook().


> Allow schema analysis object factories to be cleaned up properly when the 
> core shuts down
> -
>
> Key: SOLR-4872
> URL: https://issues.apache.org/jira/browse/SOLR-4872
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.3
>Reporter: Benson Margulies
> Attachments: solr-4872.patch, solr-4872.patch
>
>
> I have a need, in an TokenizerFactory or TokenFilterFactory, to have a shared 
> cache that is cleaned up when the core is torn down. 
> There is no 'close' protocol on these things, and Solr rejects analysis 
> components that are SolrCoreAware. 
> Possible solutions:
> # add a close protocol to these factories and make sure it gets called at 
> core shutdown.
> # allow these items to be 'core-aware'.
> # invent some notion of 'schema-lifecycle-aware'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4914) Refactor core persistence to reflect deprecating the tags in solr.xml

2013-06-14 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683363#comment-13683363
 ] 

Alan Woodward commented on SOLR-4914:
-

Thanks for opening this, Erick.

I think it makes sense to combine persistence and discovery, as they're 
basically two sides of the same coin.  So pre Solr 4.3, we loaded cores from 
solr.xml and persisted them back there.  Now we allow Solr to crawl the 
filesystem looking for core definition files and we persist these.

I'm thinking the combined interface should look something like this:

{code:java}
public interface CoreDiscoverer {

public List<CoreDescriptor> discover();

public void persist(CoreDescriptor cd);
public void delete(CoreDescriptor cd);

}
{code}

This also allows us to make discovery and persistence completely pluggable.  If 
I want to store my core definitions in an external DB, or a plain text file, or 
whatever, I implement a PlainTextCoreDiscoverer and drop it onto the classpath, 
and refer to it from solr.xml.

> Refactor core persistence to reflect deprecating the  tags in solr.xml
> 
>
> Key: SOLR-4914
> URL: https://issues.apache.org/jira/browse/SOLR-4914
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 5.0
>Reporter: Erick Erickson
> Attachments: SOLR-4914.patch
>
>
> Alan Woodward has done some work to refactor how core persistence works that 
> we should work on going forward that I want to separate from a shorter-term 
> tactical problem (See SOLR-4910).
> I'm attaching Alan's patch to this JIRA and we'll carry it forward separately 
> from 4910.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4921) Support for Adding Documents via the Solr UI

2013-06-14 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-4921:
--

Attachment: SOLR-4921.patch

CSV support, Raw Solr Command support.
commitWithin, overwrite support.

Boost for JSON support.

> Support for Adding Documents via the Solr UI
> 
>
> Key: SOLR-4921
> URL: https://issues.apache.org/jira/browse/SOLR-4921
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.4
>
> Attachments: SOLR-4921.patch, SOLR-4921.patch, SOLR-4921.patch, 
> SOLR-4921.patch, SOLR-4921.patch, SOLR-4921.patch
>
>
> For demos and prototyping, it would be nice if we could add documents via the 
> admin UI.
> Various things to support:
> 1. Uploading XML, JSON, CSV, etc.
> 2. Optionally also do file upload

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Christian (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683344#comment-13683344
 ] 

Christian edited comment on LUCENE-5058 at 6/14/13 1:48 PM:


There is no virus scanner, and the lock is not held by Explorer but by the 
Java process itself.
What do you mean by "this is not a problem"? Does this IOException affect the 
functionality of the index in any way?

  was (Author: mark_spoon):
There is no virus scanner and the lock is not held by explorer but by the 
java process. 
What do you mean with "this is not a problem"? Does this IOException affect the 
functionality of the index in any way?
  
> IOException when trying to delete data from index
> -
>
> Key: LUCENE-5058
> URL: https://issues.apache.org/jira/browse/LUCENE-5058
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.3
> Environment: Windows 7
>Reporter: Christian
>
> I have a small test prog which inserts some data in an index and after that, 
> opens a searcher from the uncommitted index to search on and output the 
> result to std.out. The Searcher is immediately closed. Then, i call 
> deleteAll() on the index but it encounters an IOException stating that the 
> index files could not be removed. I have investigated with Sysinternals and 
> it says the file is still locked despite the fact that the index searcher is 
> correctly closed. If i call deleteAll() without opening a searcher before it 
> just works fine as expected. This seems to be a bug in Lucene, since closing 
> the index searcher makes it impossible to delete the index.
> Here is the source code:
> {code:title=Bar.java|borderStyle=solid}
> public class LuceneTest {
> private Directory dir;
> private IndexWriter writer;
> 
> public void addDocs(long value) throws IOException {
> Document doc = new Document();
> doc.add(new LongField("ID", value, Field.Store.YES));
> writer.deleteDocuments(new Term("ID", "1"));
> writer.addDocument(doc);
> }
> 
> public void search() throws IOException {
>   IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
> false));
> TopDocs results = 
> searcher.search(NumericRangeQuery.newLongRange("ID", 1L, 2L, true, true), 1);
> 
> for ( ScoreDoc sc : results.scoreDocs) {
> System.out.println(searcher.doc(sc.doc));
> }
> 
> searcher.getIndexReader().close();
> }
> public static void main(String[] args) throws IOException {
> new LuceneTest();
> }
> 
> public LuceneTest() throws IOException {
> dir = FSDirectory.open(new File("test"));
> IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, 
> new StandardAnalyzer(Version.LUCENE_43));
> config.setInfoStream(System.out);
> writer = new IndexWriter(dir, config);
> 
> addDocs(1L); 
> search();
> //writer.commit(); -- If i call commit after search, then no 
> IOException occurrs!
> 
> writer.deleteAll();
> }
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4927) search pattern issues

2013-06-14 Thread raghu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

raghu updated SOLR-4927:


Affects Version/s: (was: 1.3)
   3.5

> search pattern issues
> -
>
> Key: SOLR-4927
> URL: https://issues.apache.org/jira/browse/SOLR-4927
> Project: Solr
>  Issue Type: Task
>  Components: search
>Affects Versions: 3.5
> Environment: linux,production
>Reporter: raghu
>Priority: Critical
>
> We noticed that when searching with cn6 on our site, no results were being 
> shown.
> Noticed the cn6 pattern is being changed to cnc; below are the related search 
> queries.
> Can you please let us know how we can make this work?
> Input:
> q=(crossReference1_string:(cn6*^30.0 OR cn6^60.0)) OR 
> (crossReference2_string:(cn6*^30.0 OR cn6^60.0)) OR 
> (crossReference3_string:(cn6*^30.0 OR cn6^60.0)) OR 
> (crossReference4_string:(cn6*^30.0 OR cn6^60.0)) OR 
> (crossReference5_string:(cn6*^30.0 OR cn6^60.0)) OR (code_string:(cn6^20.0 OR 
> cn6*^10.0)) OR (name_text_en:(cn6*^60.0 OR cn6^120.0)) OR 
> (productcategoryName_text_en_mv:(cn6^20.0 OR cn6*^10.0)) OR 
> (description_text_en:(cn6^100.0 OR cn6*^50.0)) OR (mpn_text:(cn6*^60.0 OR 
> cn6^120.0)) OR (milSpecNumber_string:(cn6^80.0 OR cn6*^40.0)) OR 
> (supplierName_text:(cn6*^20.0 OR cn6^40.0)) OR 
> (supplierCode_string:(cn6*^80.0 OR 
> cn6^160.0))&spellcheck=true&spellcheck.dictionary=en&spellcheck.collate=true&spellcheck.q=cn6&fq=(catalogId:"acalProductCatalogUK")&fq=(catalogVersion:Online)&start=0&rows=20&facet=true&facet.mincount=1&facet.limit=800&facet.sort=count&facet.field=category_string_mv&sort=lifecycleIndicator_int
>  desc,onlineDate_date desc
> Output:
> q=(crossReference1_string:(cnc^60.0 OR cnc*^30.0)) OR 
> (crossReference2_string:(cnc^60.0 OR cnc*^30.0)) OR 
> (crossReference3_string:(cnc^60.0 OR cnc*^30.0)) OR 
> (crossReference4_string:(cnc^60.0 OR cnc*^30.0)) OR 
> (crossReference5_string:(cnc^60.0 OR cnc*^30.0)) OR (code_string:(cnc^20.0 OR 
> cnc*^10.0)) OR (name_text_en:(cnc*^60.0 OR cnc^120.0)) OR 
> (productcategoryName_text_en_mv:(cnc^20.0 OR cnc*^10.0)) OR 
> (description_text_en:(cnc^100.0 OR cnc*^50.0)) OR (mpn_text:(cnc*^60.0 OR 
> cnc^120.0)) OR (milSpecNumber_string:(cnc*^40.0 OR cnc^80.0)) OR 
> (supplierName_text:(cnc*^20.0 OR cnc^40.0)) OR 
> (supplierCode_string:(cnc*^80.0 OR 
> cnc^160.0))&spellcheck=true&spellcheck.dictionary=en&spellcheck.collate=true&spellcheck.q=cnc&fq=(catalogId:"acalProductCatalogUK")&fq=(catalogVersion:Online)&start=0&rows=20&facet=true&facet.mincount=1&facet.limit=800&facet.sort=count&facet.field=category_string_mv&sort=lifecycleIndicator_int
>  desc,onlineDate_date desc
> Regards,
> Raghu.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-4927) search pattern issues

2013-06-14 Thread raghu (JIRA)
raghu created SOLR-4927:
---

 Summary: search pattern issues
 Key: SOLR-4927
 URL: https://issues.apache.org/jira/browse/SOLR-4927
 Project: Solr
  Issue Type: Task
  Components: search
Affects Versions: 1.3
 Environment: linux,production
Reporter: raghu
Priority: Critical



We noticed that when searching with cn6 on our site, no results were being 
shown.
Noticed the cn6 pattern is being changed to cnc; below are the related search 
queries.
Can you please let us know how we can make this work?


Input:

q=(crossReference1_string:(cn6*^30.0 OR cn6^60.0)) OR 
(crossReference2_string:(cn6*^30.0 OR cn6^60.0)) OR 
(crossReference3_string:(cn6*^30.0 OR cn6^60.0)) OR 
(crossReference4_string:(cn6*^30.0 OR cn6^60.0)) OR 
(crossReference5_string:(cn6*^30.0 OR cn6^60.0)) OR (code_string:(cn6^20.0 OR 
cn6*^10.0)) OR (name_text_en:(cn6*^60.0 OR cn6^120.0)) OR 
(productcategoryName_text_en_mv:(cn6^20.0 OR cn6*^10.0)) OR 
(description_text_en:(cn6^100.0 OR cn6*^50.0)) OR (mpn_text:(cn6*^60.0 OR 
cn6^120.0)) OR (milSpecNumber_string:(cn6^80.0 OR cn6*^40.0)) OR 
(supplierName_text:(cn6*^20.0 OR cn6^40.0)) OR (supplierCode_string:(cn6*^80.0 
OR 
cn6^160.0))&spellcheck=true&spellcheck.dictionary=en&spellcheck.collate=true&spellcheck.q=cn6&fq=(catalogId:"acalProductCatalogUK")&fq=(catalogVersion:Online)&start=0&rows=20&facet=true&facet.mincount=1&facet.limit=800&facet.sort=count&facet.field=category_string_mv&sort=lifecycleIndicator_int
 desc,onlineDate_date desc

Output:

q=(crossReference1_string:(cnc^60.0 OR cnc*^30.0)) OR 
(crossReference2_string:(cnc^60.0 OR cnc*^30.0)) OR 
(crossReference3_string:(cnc^60.0 OR cnc*^30.0)) OR 
(crossReference4_string:(cnc^60.0 OR cnc*^30.0)) OR 
(crossReference5_string:(cnc^60.0 OR cnc*^30.0)) OR (code_string:(cnc^20.0 OR 
cnc*^10.0)) OR (name_text_en:(cnc*^60.0 OR cnc^120.0)) OR 
(productcategoryName_text_en_mv:(cnc^20.0 OR cnc*^10.0)) OR 
(description_text_en:(cnc^100.0 OR cnc*^50.0)) OR (mpn_text:(cnc*^60.0 OR 
cnc^120.0)) OR (milSpecNumber_string:(cnc*^40.0 OR cnc^80.0)) OR 
(supplierName_text:(cnc*^20.0 OR cnc^40.0)) OR (supplierCode_string:(cnc*^80.0 
OR 
cnc^160.0))&spellcheck=true&spellcheck.dictionary=en&spellcheck.collate=true&spellcheck.q=cnc&fq=(catalogId:"acalProductCatalogUK")&fq=(catalogVersion:Online)&start=0&rows=20&facet=true&facet.mincount=1&facet.limit=800&facet.sort=count&facet.field=category_string_mv&sort=lifecycleIndicator_int
 desc,onlineDate_date desc

Regards,
Raghu.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5057) Hunspell stemmer generates multiple tokens

2013-06-14 Thread Luca Cavanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Cavanna updated LUCENE-5057:
-

Description: 
The hunspell stemmer seems to be generating multiple tokens: the original token 
plus the available stems.

It might be a good thing in some cases but it seems to be a different behaviour 
compared to the other stemmers and causes problems as well. I would rather have 
an option to decide whether it should output only the available stems, or the 
stems plus the original token. I'm not sure though if it's possible to have 
only a single stem indexed, which would be even better in my opinion. When I 
look at how snowball works only one token is indexed, the stem, and that works 
great. Probably there's something I'm missing in how hunspell works.

Here is my issue: I have a query composed of multiple terms, which is analyzed 
using stemming and a boolean query is generated out of it. All fine when adding 
all clauses as should (OR operator), but if I add all clauses as must (AND 
operator), then I can get back only the documents that contain the stem 
originated by the exactly same original word.

Example for the Dutch language I'm working with: fiets (means bicycle in 
Dutch); its plural is fietsen.

If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index 
"fiets" I get only "fiets" indexed.

When I query for "fietsen whatever" I get the following boolean query: 
field:fiets field:fietsen field:whatever.

If I apply the AND operator and use must clauses for each subquery, then I can 
only find the documents that originally contained "fietsen", not the ones that 
originally contained "fiets", which is not really what stemming is about.

Any thoughts on this? I also wonder if it can be a dictionary issue since I see 
that different words that have the word "fiets" as root don't get the same 
stems, and using the AND operator at query time is a big issue.

I would love to contribute on this and looking forward to your feedback.



  was:
The hunspell stemmer seems to be generating multiple tokens: the original token 
plus the available stems.

It might be a good thing in some cases but it seems to be a different behaviour 
compared to the other stemmers and causes problems as well. I would rather have 
an option to decide whether it should output only the available stems, or the 
stems plus the original token. I'm not sure though if it's possible to have 
only a single stem indexed, which would be even better in my opinion. When I 
look at how snowball works only one token is indexed, the stem, and that works 
great. Probably there's something I'm missing in how hunspell works.

Here is my issue: I have a query composed of multiple terms, which is analyzed 
using stemming and a boolean query is generated out of it. All fine when adding 
all clauses as should (OR operator), but if I add all clauses as must (AND 
operator), then I can get back only the documents that contain the stem 
originated by the exactly same original word.

Example for the dutch language I'm working with: fiets (means bicycle in 
dutch), its plural is fietsen.

If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index 
"fiets" I get the only "fiets" indexed.

When I query for "fietsen whatever" I get the following boolean query: 
field:fiets field:fietsen field:whatever.

If I apply the AND operator and use must clauses for each subquery, then I can 
only find the documents that originally contained "fietsen", not the ones that 
originally contained "fiets", which is not really what stemming is about.

Any thoughts on this? I wonder if it can be a dictionary issue since I see that 
different words that have the word "fiets" as root don't get the same stems, 
and using the AND operator at query time is a big issue.




> Hunspell stemmer generates multiple tokens
> --
>
> Key: LUCENE-5057
> URL: https://issues.apache.org/jira/browse/LUCENE-5057
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 4.3
>Reporter: Luca Cavanna
>
> The hunspell stemmer seems to be generating multiple tokens: the original 
> token plus the available stems.
> It might be a good thing in some cases but it seems to be a different 
> behaviour compared to the other stemmers and causes problems as well. I would 
> rather have an option to decide whether it should output only the available 
> stems, or the stems plus the original token. I'm not sure though if it's 
> possible to have only a single stem indexed, which would be even better in my 
> opinion. When I look at how snowball works only one token is indexed, the 
> stem, and that works great. Probably there's something I'm missing in how 
> hunspell works.
> Here is my issue: I have a query composed of multiple terms, which i

[jira] [Updated] (LUCENE-5057) Hunspell stemmer generates multiple tokens

2013-06-14 Thread Luca Cavanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Cavanna updated LUCENE-5057:
-

Description: 
The hunspell stemmer seems to be generating multiple tokens: the original token 
plus the available stems.

It might be a good thing in some cases but it seems to be a different behaviour 
compared to the other stemmers and causes problems as well. I would rather have 
an option to decide whether it should output only the available stems, or the 
stems plus the original token. I'm not sure though if it's possible to have 
only a single stem indexed, which would be even better in my opinion. When I 
look at how snowball works only one token is indexed, the stem, and that works 
great. Probably there's something I'm missing in how hunspell works.

Here is my issue: I have a query composed of multiple terms, which is analyzed 
using stemming and a boolean query is generated out of it. All fine when adding 
all clauses as should (OR operator), but if I add all clauses as must (AND 
operator), then I can get back only the documents that contain the stem 
originated by the exactly same original word.

Example for the Dutch language I'm working with: fiets (means bicycle in 
Dutch); its plural is fietsen.

If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index 
"fiets" I get only "fiets" indexed.

When I query for "fietsen whatever" I get the following boolean query: 
field:fiets field:fietsen field:whatever.

If I apply the AND operator and use must clauses for each subquery, then I can 
only find the documents that originally contained "fietsen", not the ones that 
originally contained "fiets", which is not really what stemming is about.

Any thoughts on this? I wonder if it can be a dictionary issue since I see that 
different words that have the word "fiets" as root don't get the same stems, 
and using the AND operator at query time is a big issue.



  was:
The hunspell stemmer seems to be generating multiple tokens: the original token 
plus the available stems.

It might be a good thing in some cases but it seems to be a different behaviour 
compared to the other stemmers and causes problems as well. I would rather have 
an option to decide whether it should output only the available stems, or the 
stems plus the original token. I'm not sure though if it's possible to have 
only a single stem indexed.

When I look at how snowball works, only one token is indexed, the stem, and 
that works great.

Here is my issue: I have a query composed of multiple terms, which is analyzed 
using stemming and a boolean query is generated out of it. All fine when adding 
all clauses as should (OR operator), but if I add all clauses as must (AND 
operator), then I can get back only the documents that contain the stem 
originated by the exactly same original word.

Example for the dutch language I'm working with: fiets (means bicycle in 
dutch), its plural is fietsen.

If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index 
"fiets" I get the only "fiets" indexed.

When I query for "fietsen whatever" I get the following boolean query: 
field:fiets field:fietsen field:whatever.

If I apply the AND operator and use must clauses for each subquery, then I can 
only find the documents that originally contained "fietsen", not the ones that 
originally contained "fiets", which is not really what stemming is about.

Any thoughts on this? I would work out a patch, I'd just need some help 
deciding the name of the option and what the default behaviour should be.


> Hunspell stemmer generates multiple tokens
> --
>
> Key: LUCENE-5057
> URL: https://issues.apache.org/jira/browse/LUCENE-5057
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 4.3
>Reporter: Luca Cavanna
>
> The hunspell stemmer seems to be generating multiple tokens: the original 
> token plus the available stems.
> It might be a good thing in some cases but it seems to be a different 
> behaviour compared to the other stemmers and causes problems as well. I would 
> rather have an option to decide whether it should output only the available 
> stems, or the stems plus the original token. I'm not sure though if it's 
> possible to have only a single stem indexed, which would be even better in my 
> opinion. When I look at how snowball works only one token is indexed, the 
> stem, and that works great. Probably there's something I'm missing in how 
> hunspell works.
> Here is my issue: I have a query composed of multiple terms, which is 
> analyzed using stemming and a boolean query is generated out of it. All fine 
> when adding all clauses as should (OR operator), but if I add all clauses as 
> must (AND operator), then I can get back only the documents that contain the 
> stem ori

[jira] [Commented] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Christian (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683344#comment-13683344
 ] 

Christian commented on LUCENE-5058:
---

There is no virus scanner, and the lock is not held by Explorer but by the 
Java process.
What do you mean by "this is not a problem"? Does this IOException affect the 
functionality of the index in any way?

> IOException when trying to delete data from index
> -
>
> Key: LUCENE-5058
> URL: https://issues.apache.org/jira/browse/LUCENE-5058
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.3
> Environment: Windows 7
>Reporter: Christian
>
> I have a small test prog which inserts some data in an index and after that, 
> opens a searcher from the uncommitted index to search on and output the 
> result to std.out. The Searcher is immediately closed. Then, i call 
> deleteAll() on the index but it encounters an IOException stating that the 
> index files could not be removed. I have investigated with Sysinternals and 
> it says the file is still locked despite the fact that the index searcher is 
> correctly closed. If i call deleteAll() without opening a searcher before it 
> just works fine as expected. This seems to be a bug in Lucene, since closing 
> the index searcher makes it impossible to delete the index.
> Here is the source code:
> {code:title=Bar.java|borderStyle=solid}
> public class LuceneTest {
> private Directory dir;
> private IndexWriter writer;
> 
> public void addDocs(long value) throws IOException {
> Document doc = new Document();
> doc.add(new LongField("ID", value, Field.Store.YES));
> writer.deleteDocuments(new Term("ID", "1"));
> writer.addDocument(doc);
> }
> 
> public void search() throws IOException {
>   IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
> false));
> TopDocs results = 
> searcher.search(NumericRangeQuery.newLongRange("ID", 1L, 2L, true, true), 1);
> 
> for ( ScoreDoc sc : results.scoreDocs) {
> System.out.println(searcher.doc(sc.doc));
> }
> 
> searcher.getIndexReader().close();
> }
> public static void main(String[] args) throws IOException {
> new LuceneTest();
> }
> 
> public LuceneTest() throws IOException {
> dir = FSDirectory.open(new File("test"));
> IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, 
> new StandardAnalyzer(Version.LUCENE_43));
> config.setInfoStream(System.out);
> writer = new IndexWriter(dir, config);
> 
> addDocs(1L); 
> search();
> //writer.commit(); -- If i call commit after search, then no 
> IOException occurrs!
> 
> writer.deleteAll();
> }
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5057) Hunspell stemmer generates multiple tokens

2013-06-14 Thread Luca Cavanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Cavanna updated LUCENE-5057:
-

Description: 
The hunspell stemmer seems to be generating multiple tokens: the original token 
plus the available stems.

It might be a good thing in some cases but it seems to be a different behaviour 
compared to the other stemmers and causes problems as well. I would rather have 
an option to decide whether it should output only the available stems, or the 
stems plus the original token. I'm not sure though if it's possible to have 
only a single stem indexed.

When I look at how snowball works, only one token is indexed, the stem, and 
that works great.

Here is my issue: I have a query composed of multiple terms, which is analyzed 
using stemming and a boolean query is generated out of it. All fine when adding 
all clauses as should (OR operator), but if I add all clauses as must (AND 
operator), then I can get back only the documents that contain the stem 
originated by the exactly same original word.

Example for the Dutch language I'm working with: fiets (means bicycle in 
Dutch); its plural is fietsen.

If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index 
"fiets" I get only "fiets" indexed.

When I query for "fietsen whatever" I get the following boolean query: 
field:fiets field:fietsen field:whatever.

If I apply the AND operator and use must clauses for each subquery, then I can 
only find the documents that originally contained "fietsen", not the ones that 
originally contained "fiets", which is not really what stemming is about.

Any thoughts on this? I would work out a patch, I'd just need some help 
deciding the name of the option and what the default behaviour should be.

  was:
The hunspell stemmer seems to be generating multiple tokens: the original token 
plus the available stems.

It might be a good thing in some cases but it seems to be a different behaviour 
compared to the other stemmers and causes problems as well. I would rather have 
an option to decide whether it should output only the available stems, or the 
stems plus the original token.

Here is my issue: I have a query composed of multiple terms, which is analyzed 
using stemming and a boolean query is generated out of it. All fine when adding 
all clauses as should (OR operator), but if I add all clauses as must (AND 
operator), then I can get back only the documents that contain the stem 
originated by the exactly same original word.

Example for the dutch language I'm working with: fiets (means bicycle in 
dutch), its plural is fietsen.

If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index 
"fiets" I get the only "fiets" indexed.

When I query for "fietsen whatever" I get the following boolean query: 
field:fiets field:fietsen field:whatever.

If I apply the AND operator and use must clauses for each subquery, then I can 
only find the documents that originally contained "fietsen", not the ones that 
originally contained "fiets", which is not really what stemming is about.

Any thoughts on this? I would work out a patch, I'd just need some help 
deciding the name of the option and what the default behaviour should be.


> Hunspell stemmer generates multiple tokens
> --
>
> Key: LUCENE-5057
> URL: https://issues.apache.org/jira/browse/LUCENE-5057
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 4.3
>Reporter: Luca Cavanna
>
> The hunspell stemmer seems to be generating multiple tokens: the original 
> token plus the available stems.
> It might be a good thing in some cases but it seems to be a different 
> behaviour compared to the other stemmers and causes problems as well. I would 
> rather have an option to decide whether it should output only the available 
> stems, or the stems plus the original token. I'm not sure though if it's 
> possible to have only a single stem indexed.
> When I look at how snowball works, only one token is indexed, the stem, and 
> that works great.
> Here is my issue: I have a query composed of multiple terms, which is 
> analyzed using stemming and a boolean query is generated out of it. All fine 
> when adding all clauses as should (OR operator), but if I add all clauses as 
> must (AND operator), then I can get back only the documents that contain the 
> stem originated by the exactly same original word.
> Example for the dutch language I'm working with: fiets (means bicycle in 
> dutch), its plural is fietsen.
> If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index 
> "fiets" I get the only "fiets" indexed.
> When I query for "fietsen whatever" I get the following boolean query: 
> field:fiets field:fietsen field:whatever.
> If I apply the AND operator and use must clauses for eac

[jira] [Updated] (LUCENE-5057) Hunspell stemmer generates multiple tokens

2013-06-14 Thread Luca Cavanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luca Cavanna updated LUCENE-5057:
-

Summary: Hunspell stemmer generates multiple tokens  (was: Hunspell stemmer 
generates multiple tokens (original + stems))

> Hunspell stemmer generates multiple tokens
> --
>
> Key: LUCENE-5057
> URL: https://issues.apache.org/jira/browse/LUCENE-5057
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 4.3
>Reporter: Luca Cavanna
>
> The hunspell stemmer seems to be generating multiple tokens: the original 
> token plus the available stems.
> It might be a good thing in some cases but it seems to be a different 
> behaviour compared to the other stemmers and causes problems as well. I would 
> rather have an option to decide whether it should output only the available 
> stems, or the stems plus the original token.
> Here is my issue: I have a query composed of multiple terms, which is 
> analyzed using stemming and a boolean query is generated out of it. All fine 
> when adding all clauses as should (OR operator), but if I add all clauses as 
> must (AND operator), then I can get back only the documents that contain the 
> stem originated by the exactly same original word.
> Example for the dutch language I'm working with: fiets (means bicycle in 
> dutch), its plural is fietsen.
> If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index 
> "fiets" I get the only "fiets" indexed.
> When I query for "fietsen whatever" I get the following boolean query: 
> field:fiets field:fietsen field:whatever.
> If I apply the AND operator and use must clauses for each subquery, then I 
> can only find the documents that originally contained "fietsen", not the ones 
> that originally contained "fiets", which is not really what stemming is about.
> Any thoughts on this? I would work out a patch, I'd just need some help 
> deciding the name of the option and what the default behaviour should be.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Christian (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683322#comment-13683322
 ] 

Christian edited comment on LUCENE-5058 at 6/14/13 1:29 PM:


There is no stack trace, since the IOException inside Lucene is not thrown. 
Below is the log of the IndexWriter's infostream (the IOExceptions are near 
the end of the log).

IFD 0 [Fri Jun 14 14:49:24 CEST 2013; main]: init: current segments file is 
"segments_3"; 
deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@1c99159
IFD 0 [Fri Jun 14 14:49:24 CEST 2013; main]: init: load commit "segments_3"
IFD 0 [Fri Jun 14 14:49:24 CEST 2013; main]: now checkpoint "_1(4.3):C1 
_2(4.3):C1" [2 segments ; isCommit = false]
IFD 0 [Fri Jun 14 14:49:24 CEST 2013; main]: 0 msec to checkpoint
IW 0 [Fri Jun 14 14:49:24 CEST 2013; main]: init: create=false
IW 0 [Fri Jun 14 14:49:24 CEST 2013; main]: 
dir=org.apache.lucene.store.SimpleFSDirectory@D:\CAM_server\test 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@74c3aa
index=_1(4.3):C1 _2(4.3):C1
version=4.3.0 1477023 - simonw - 2013-04-29 14:50:23
matchVersion=LUCENE_43
analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
ramBufferSizeMB=16.0
maxBufferedDocs=-1
maxBufferedDeleteTerms=-1
mergedSegmentWarmer=null
readerTermsIndexDivisor=1
termIndexInterval=32
delPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy
commit=null
openMode=CREATE_OR_APPEND
similarity=org.apache.lucene.search.similarities.DefaultSimilarity
mergeScheduler=ConcurrentMergeScheduler: maxThreadCount=1, maxMergeCount=2, 
mergeThreadPriority=-1
default WRITE_LOCK_TIMEOUT=1000
writeLockTimeout=1000
codec=Lucene42
infoStream=org.apache.lucene.util.PrintStreamInfoStream
mergePolicy=[TieredMergePolicy: maxMergeAtOnce=10, maxMergeAtOnceExplicit=30, 
maxMergedSegmentMB=5120.0, floorSegmentMB=2.0, 
forceMergeDeletesPctAllowed=10.0, segmentsPerTier=10.0, useCompoundFile=true, 
maxCFSSegmentSizeMB=8.796093022207999E12, noCFSRatio=0.1
indexerThreadPool=org.apache.lucene.index.ThreadAffinityDocumentsWriterThreadPool@1d9fd51
readerPooling=false
perThreadHardLimitMB=1945

IW 0 [Fri Jun 14 14:49:24 CEST 2013; main]: flush at getReader
DW 0 [Fri Jun 14 14:49:24 CEST 2013; main]: main startFullFlush
DW 0 [Fri Jun 14 14:49:24 CEST 2013; main]: anyChanges? numDocsInRam=1 
deletes=true hasTickets:false pendingChangesInFullFlush: false
DWFC 0 [Fri Jun 14 14:49:24 CEST 2013; main]: addFlushableState 
DocumentsWriterPerThread [pendingDeletes=gen=0, segment=_3, aborting=false, 
numDocsInRAM=1, deleteQueue=DWDQ: [ generation: 0 ]]
DWPT 0 [Fri Jun 14 14:49:24 CEST 2013; main]: flush postings as segment _3 
numDocs=1
DWPT 0 [Fri Jun 14 14:49:25 CEST 2013; main]: new segment has 0 deleted docs
DWPT 0 [Fri Jun 14 14:49:25 CEST 2013; main]: new segment has no vectors; no 
norms; no docValues; no prox; no freqs
DWPT 0 [Fri Jun 14 14:49:25 CEST 2013; main]: flushedFiles=[_3.fdx, 
_3_Lucene41_0.doc, _3_Lucene41_0.tip, _3.fnm, _3_Lucene41_0.tim, _3.fdt]
DWPT 0 [Fri Jun 14 14:49:25 CEST 2013; main]: flushed codec=Lucene42
DWPT 0 [Fri Jun 14 14:49:25 CEST 2013; main]: flushed: segment=_3 ramUsed=0.063 
MB newFlushedSize(includes docstores)=0.001 MB docs/MB=1,833.175
DW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: publishFlushedSegment seg-private 
deletes=null
IW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: publishFlushedSegment
BD 0 [Fri Jun 14 14:49:25 CEST 2013; main]: push deletes  1 deleted terms 
(unique count=1) bytesUsed=1024 delGen=1 packetCount=1 totBytesUsed=1024
IW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: publish sets newSegment delGen=2 
seg=_3(4.3):C1
IFD 0 [Fri Jun 14 14:49:25 CEST 2013; main]: now checkpoint "_1(4.3):C1 
_2(4.3):C1 _3(4.3):C1" [3 segments ; isCommit = false]
IFD 0 [Fri Jun 14 14:49:25 CEST 2013; main]: 0 msec to checkpoint
IW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: don't apply deletes now 
delTermCount=1 bytesUsed=1024
IW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: return reader version=10 
reader=StandardDirectoryReader(segments_3:10:nrt _1(4.3):C1 _2(4.3):C1 
_3(4.3):C1)
DW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: main finishFullFlush success=true
TMP 0 [Fri Jun 14 14:49:25 CEST 2013; main]: findMerges: 3 segments
TMP 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   seg=_1(4.3):C1 size=0.000 MB 
[floored]
TMP 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   seg=_2(4.3):C1 size=0.000 MB 
[floored]
TMP 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   seg=_3(4.3):C1 size=0.000 MB 
[floored]
TMP 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   allowedSegmentCount=1 vs count=3 
(eligible count=3) tooBigCount=0
CMS 0 [Fri Jun 14 14:49:25 CEST 2013; main]: now merge
CMS 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   index: _1(4.3):C1 _2(4.3):C1 
_3(4.3):C1
CMS 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   no more merges pending; now 
return
IW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: getReader took 194 msec
Docume

[jira] [Updated] (SOLR-4921) Support for Adding Documents via the Solr UI

2013-06-14 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-4921:
--

Attachment: SOLR-4921.patch

Can add JSON or XML now, a few other tweaks.

> Support for Adding Documents via the Solr UI
> 
>
> Key: SOLR-4921
> URL: https://issues.apache.org/jira/browse/SOLR-4921
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.4
>
> Attachments: SOLR-4921.patch, SOLR-4921.patch, SOLR-4921.patch, 
> SOLR-4921.patch, SOLR-4921.patch
>
>
> For demos and prototyping, it would be nice if we could add documents via the 
> admin UI.
> Various things to support:
> 1. Uploading XML, JSON, CSV, etc.
> 2. Optionally also do file upload

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683326#comment-13683326
 ] 

Simon Willnauer commented on LUCENE-5058:
-

Can you please close the issue; this is not a problem. There might be a virus 
scanner or your Explorer still holding handles to these files. This is expected 
under Windows.
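
For reference, here is a minimal sketch (not the reporter's code; the index path 
and field name are made up) of the pattern that keeps this harmless: close the 
NRT reader in a finally block and let IndexWriter retry any file deletes the OS 
refused because another process still held the file open.

{code:title=NrtReaderCleanup.java|borderStyle=solid}
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class NrtReaderCleanup {
    public static void main(String[] args) throws IOException {
        Directory dir = FSDirectory.open(new File("test-index"));
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43,
                new StandardAnalyzer(Version.LUCENE_43));
        IndexWriter writer = new IndexWriter(dir, config);
        try {
            Document doc = new Document();
            doc.add(new LongField("ID", 1L, Field.Store.YES));
            writer.addDocument(doc);

            // Open an NRT reader and release it even if searching fails.
            DirectoryReader reader = DirectoryReader.open(writer, false);
            try {
                // ... search with new IndexSearcher(reader) here ...
            } finally {
                reader.close(); // releases the reader's file handles
            }

            // On Windows a file that some process still has open (virus scanner,
            // Explorer preview, ...) cannot be deleted right away; IndexWriter
            // records the failure in the infostream and retries the delete on a
            // later flush, commit or close, so those IOException lines are not fatal.
            writer.deleteAll();
            writer.commit();
        } finally {
            writer.close();
            dir.close();
        }
    }
}
{code}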

> IOException when trying to delete data from index
> -
>
> Key: LUCENE-5058
> URL: https://issues.apache.org/jira/browse/LUCENE-5058
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.3
> Environment: Windows 7
>Reporter: Christian
>
> I have a small test prog which inserts some data in an index and after that, 
> opens a searcher from the uncommitted index to search on and output the 
> result to std.out. The Searcher is immediately closed. Then, i call 
> deleteAll() on the index but it encounters an IOException stating that the 
> index files could not be removed. I have investigated with Sysinternals and 
> it says the file is still locked despite the fact that the index searcher is 
> correctly closed. If i call deleteAll() without opening a searcher before it 
> just works fine as expected. This seems to be a bug in Lucene, since closing 
> the index searcher makes it impossible to delete the index.
> Here is the source code:
> {code:title=Bar.java|borderStyle=solid}
> public class LuceneTest {
> private Directory dir;
> private IndexWriter writer;
> 
> public void addDocs(long value) throws IOException {
> Document doc = new Document();
> doc.add(new LongField("ID", value, Field.Store.YES));
> writer.deleteDocuments(new Term("ID", "1"));
> writer.addDocument(doc);
> }
> 
> public void search() throws IOException {
>   IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
> false));
> TopDocs results = 
> searcher.search(NumericRangeQuery.newLongRange("ID", 1L, 2L, true, true), 1);
> 
> for ( ScoreDoc sc : results.scoreDocs) {
> System.out.println(searcher.doc(sc.doc));
> }
> 
> searcher.getIndexReader().close();
> }
> public static void main(String[] args) throws IOException {
> new LuceneTest();
> }
> 
> public LuceneTest() throws IOException {
> dir = FSDirectory.open(new File("test"));
> IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, 
> new StandardAnalyzer(Version.LUCENE_43));
> config.setInfoStream(System.out);
> writer = new IndexWriter(dir, config);
> 
> addDocs(1L); 
> search();
> //writer.commit(); -- If i call commit after search, then no 
> IOException occurrs!
> 
> writer.deleteAll();
> }
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Christian (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683322#comment-13683322
 ] 

Christian commented on LUCENE-5058:
---

There is no stack trace, since the IOException inside Lucene is not thrown. 
Below is the log of the IndexWriter's infostream (the IOExceptions are near 
the end of the log).

IFD 0 [Fri Jun 14 14:49:24 CEST 2013; main]: init: current segments file is 
"segments_3"; 
deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@1c99159
IFD 0 [Fri Jun 14 14:49:24 CEST 2013; main]: init: load commit "segments_3"
IFD 0 [Fri Jun 14 14:49:24 CEST 2013; main]: now checkpoint "_1(4.3):C1 
_2(4.3):C1" [2 segments ; isCommit = false]
IFD 0 [Fri Jun 14 14:49:24 CEST 2013; main]: 0 msec to checkpoint
IW 0 [Fri Jun 14 14:49:24 CEST 2013; main]: init: create=false
IW 0 [Fri Jun 14 14:49:24 CEST 2013; main]: 
dir=org.apache.lucene.store.SimpleFSDirectory@D:\Workspace_Trunk_Neu\CAM\CAM_server\test
 lockFactory=org.apache.lucene.store.NativeFSLockFactory@74c3aa
index=_1(4.3):C1 _2(4.3):C1
version=4.3.0 1477023 - simonw - 2013-04-29 14:50:23
matchVersion=LUCENE_43
analyzer=org.apache.lucene.analysis.standard.StandardAnalyzer
ramBufferSizeMB=16.0
maxBufferedDocs=-1
maxBufferedDeleteTerms=-1
mergedSegmentWarmer=null
readerTermsIndexDivisor=1
termIndexInterval=32
delPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy
commit=null
openMode=CREATE_OR_APPEND
similarity=org.apache.lucene.search.similarities.DefaultSimilarity
mergeScheduler=ConcurrentMergeScheduler: maxThreadCount=1, maxMergeCount=2, 
mergeThreadPriority=-1
default WRITE_LOCK_TIMEOUT=1000
writeLockTimeout=1000
codec=Lucene42
infoStream=org.apache.lucene.util.PrintStreamInfoStream
mergePolicy=[TieredMergePolicy: maxMergeAtOnce=10, maxMergeAtOnceExplicit=30, 
maxMergedSegmentMB=5120.0, floorSegmentMB=2.0, 
forceMergeDeletesPctAllowed=10.0, segmentsPerTier=10.0, useCompoundFile=true, 
maxCFSSegmentSizeMB=8.796093022207999E12, noCFSRatio=0.1
indexerThreadPool=org.apache.lucene.index.ThreadAffinityDocumentsWriterThreadPool@1d9fd51
readerPooling=false
perThreadHardLimitMB=1945

IW 0 [Fri Jun 14 14:49:24 CEST 2013; main]: flush at getReader
DW 0 [Fri Jun 14 14:49:24 CEST 2013; main]: main startFullFlush
DW 0 [Fri Jun 14 14:49:24 CEST 2013; main]: anyChanges? numDocsInRam=1 
deletes=true hasTickets:false pendingChangesInFullFlush: false
DWFC 0 [Fri Jun 14 14:49:24 CEST 2013; main]: addFlushableState 
DocumentsWriterPerThread [pendingDeletes=gen=0, segment=_3, aborting=false, 
numDocsInRAM=1, deleteQueue=DWDQ: [ generation: 0 ]]
DWPT 0 [Fri Jun 14 14:49:24 CEST 2013; main]: flush postings as segment _3 
numDocs=1
DWPT 0 [Fri Jun 14 14:49:25 CEST 2013; main]: new segment has 0 deleted docs
DWPT 0 [Fri Jun 14 14:49:25 CEST 2013; main]: new segment has no vectors; no 
norms; no docValues; no prox; no freqs
DWPT 0 [Fri Jun 14 14:49:25 CEST 2013; main]: flushedFiles=[_3.fdx, 
_3_Lucene41_0.doc, _3_Lucene41_0.tip, _3.fnm, _3_Lucene41_0.tim, _3.fdt]
DWPT 0 [Fri Jun 14 14:49:25 CEST 2013; main]: flushed codec=Lucene42
DWPT 0 [Fri Jun 14 14:49:25 CEST 2013; main]: flushed: segment=_3 ramUsed=0.063 
MB newFlushedSize(includes docstores)=0.001 MB docs/MB=1,833.175
DW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: publishFlushedSegment seg-private 
deletes=null
IW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: publishFlushedSegment
BD 0 [Fri Jun 14 14:49:25 CEST 2013; main]: push deletes  1 deleted terms 
(unique count=1) bytesUsed=1024 delGen=1 packetCount=1 totBytesUsed=1024
IW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: publish sets newSegment delGen=2 
seg=_3(4.3):C1
IFD 0 [Fri Jun 14 14:49:25 CEST 2013; main]: now checkpoint "_1(4.3):C1 
_2(4.3):C1 _3(4.3):C1" [3 segments ; isCommit = false]
IFD 0 [Fri Jun 14 14:49:25 CEST 2013; main]: 0 msec to checkpoint
IW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: don't apply deletes now 
delTermCount=1 bytesUsed=1024
IW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: return reader version=10 
reader=StandardDirectoryReader(segments_3:10:nrt _1(4.3):C1 _2(4.3):C1 
_3(4.3):C1)
DW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: main finishFullFlush success=true
TMP 0 [Fri Jun 14 14:49:25 CEST 2013; main]: findMerges: 3 segments
TMP 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   seg=_1(4.3):C1 size=0.000 MB 
[floored]
TMP 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   seg=_2(4.3):C1 size=0.000 MB 
[floored]
TMP 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   seg=_3(4.3):C1 size=0.000 MB 
[floored]
TMP 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   allowedSegmentCount=1 vs count=3 
(eligible count=3) tooBigCount=0
CMS 0 [Fri Jun 14 14:49:25 CEST 2013; main]: now merge
CMS 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   index: _1(4.3):C1 _2(4.3):C1 
_3(4.3):C1
CMS 0 [Fri Jun 14 14:49:25 CEST 2013; main]:   no more merges pending; now 
return
IW 0 [Fri Jun 14 14:49:25 CEST 2013; main]: getReader took 194 msec
Document>
IW 0 [Fri Jun 14 14:4

[jira] [Commented] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683314#comment-13683314
 ] 

Uwe Schindler commented on LUCENE-5058:
---

Could you please post the stack trace?

> IOException when trying to delete data from index
> -
>
> Key: LUCENE-5058
> URL: https://issues.apache.org/jira/browse/LUCENE-5058
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.3
> Environment: Windows 7
>Reporter: Christian
>
> I have a small test prog which inserts some data in an index and after that, 
> opens a searcher from the uncommitted index to search on and output the 
> result to std.out. The Searcher is immediately closed. Then, i call 
> deleteAll() on the index but it encounters an IOException stating that the 
> index files could not be removed. I have investigated with Sysinternals and 
> it says the file is still locked despite the fact that the index searcher is 
> correctly closed. If i call deleteAll() without opening a searcher before it 
> just works fine as expected. This seems to be a bug in Lucene, since closing 
> the index searcher makes it impossible to delete the index.
> Here is the source code:
> {code:title=Bar.java|borderStyle=solid}
> public class LuceneTest {
> private Directory dir;
> private IndexWriter writer;
> 
> public void addDocs(long value) throws IOException {
> Document doc = new Document();
> doc.add(new LongField("ID", value, Field.Store.YES));
> writer.deleteDocuments(new Term("ID", "1"));
> writer.addDocument(doc);
> }
> 
> public void search() throws IOException {
>   IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
> false));
> TopDocs results = 
> searcher.search(NumericRangeQuery.newLongRange("ID", 1L, 2L, true, true), 1);
> 
> for ( ScoreDoc sc : results.scoreDocs) {
> System.out.println(searcher.doc(sc.doc));
> }
> 
> searcher.getIndexReader().close();
> }
> public static void main(String[] args) throws IOException {
> new LuceneTest();
> }
> 
> public LuceneTest() throws IOException {
> dir = FSDirectory.open(new File("test"));
> IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, 
> new StandardAnalyzer(Version.LUCENE_43));
> config.setInfoStream(System.out);
> writer = new IndexWriter(dir, config);
> 
> addDocs(1L); 
> search();
> //writer.commit(); -- If i call commit after search, then no 
> IOException occurrs!
> 
> writer.deleteAll();
> }
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-4921) Support for Adding Documents via the Solr UI

2013-06-14 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683311#comment-13683311
 ] 

Grant Ingersoll edited comment on SOLR-4921 at 6/14/13 12:31 PM:
-

Also, still need to handle the command parameters like commitWithin, etc.  
Also, handling incremental updates, etc. would be nice

  was (Author: gsingers):
Also, still need to handle the command parameters like commitWithin, etc.
  
> Support for Adding Documents via the Solr UI
> 
>
> Key: SOLR-4921
> URL: https://issues.apache.org/jira/browse/SOLR-4921
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.4
>
> Attachments: SOLR-4921.patch, SOLR-4921.patch, SOLR-4921.patch, 
> SOLR-4921.patch
>
>
> For demos and prototyping, it would be nice if we could add documents via the 
> admin UI.
> Various things to support:
> 1. Uploading XML, JSON, CSV, etc.
> 2. Optionally also do file upload

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Christian (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian updated LUCENE-5058:
--

Description: 
I have a small test prog which inserts some data in an index and after that, 
opens a searcher from the uncommitted index to search on and output the result 
to std.out. The Searcher is immediately closed. Then, i call deleteAll() on the 
index but it encounters an IOException stating that the index files could not 
be removed. I have investigated with Sysinternals and it says the file is still 
locked despite the fact that the index searcher is correctly closed. If i call 
deleteAll() without opening a searcher before it just works fine as expected. 
This seems to be a bug in Lucene, since closing the index searcher makes it 
impossible to delete the index.

Here is the source code:

{code:title=Bar.java|borderStyle=solid}
public class LuceneTest {

private Directory dir;
private IndexWriter writer;

public void addDocs(long value) throws IOException {
Document doc = new Document();
doc.add(new LongField("ID", value, Field.Store.YES));
writer.deleteDocuments(new Term("ID", "1"));
writer.addDocument(doc);
}

public void search() throws IOException {
  IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
false));
TopDocs results = searcher.search(NumericRangeQuery.newLongRange("ID", 
1L, 2L, true, true), 1);

for ( ScoreDoc sc : results.scoreDocs) {
System.out.println(searcher.doc(sc.doc));
}

searcher.getIndexReader().close();
}

public static void main(String[] args) throws IOException {
new LuceneTest();
}

public LuceneTest() throws IOException {
dir = FSDirectory.open(new File("test"));
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, new 
StandardAnalyzer(Version.LUCENE_43));
config.setInfoStream(System.out);
writer = new IndexWriter(dir, config);

addDocs(1L); 
search();
//writer.commit(); -- If i call commit after search, then no 
IOException occurrs!

writer.deleteAll();
}
}
{code}

  was:
I have a small test prog which inserts some data in an index and after that, 
opens a searcher from the uncommitted index to search on and output the result 
to std.out. The Searcher is immediately closed. Then, i call deleteAll() on the 
index but it encounters an IOException stating that the index files could not 
be removed. I have investigated with Sysinternals and it says the file is still 
locked despite the fact that the index searcher is correctly closed. If i call 
deleteAll() without opening a searcher before it just works fine as expected. 
This seems to be a bug in Lucene, since closing the index searcher makes it 
impossible to delete the index.

Here is the source code:


public class LuceneTest {

private Directory dir;
private IndexWriter writer;

public void addDocs(long value) throws IOException {
Document doc = new Document();
doc.add(new LongField("ID", value, Field.Store.YES));
writer.deleteDocuments(new Term("ID", "1"));
writer.addDocument(doc);
}

public void search() throws IOException {
  IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
false));
TopDocs results = searcher.search(NumericRangeQuery.newLongRange("ID", 
1L, 2L, true, true), 1);

for ( ScoreDoc sc : results.scoreDocs) {
System.out.println(searcher.doc(sc.doc));
}

searcher.getIndexReader().close();
}

public static void main(String[] args) throws IOException {
new LuceneTest();
}

public LuceneTest() throws IOException {
dir = FSDirectory.open(new File("test"));
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, new 
StandardAnalyzer(Version.LUCENE_43));
config.setInfoStream(System.out);
writer = new IndexWriter(dir, config);

addDocs(1L); 
search();
//writer.commit(); -- If i call commit after search, then no 
IOException occurrs!

writer.deleteAll();
}
}



> IOException when trying to delete data from index
> -
>
> Key: LUCENE-5058
> URL: https://issues.apache.org/jira/browse/LUCENE-5058
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.3
> Environment: Windows 7
>Reporter: Christian
>
> I have a small test prog which inserts some data in an index and after that, 
> opens a searcher from the uncommitted index to search on and output the 
> result to std.out. The Searcher is immediately closed. Then, i c

[jira] [Updated] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Christian (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian updated LUCENE-5058:
--

Description: 
I have a small test prog which inserts some data in an index and after that, 
opens a searcher from the uncommitted index to search on and output the result 
to std.out. The Searcher is immediately closed. Then, i call deleteAll() on the 
index but it encounters an IOException stating that the index files could not 
be removed. I have investigated with Sysinternals and it says the file is still 
locked despite the fact that the index searcher is correctly closed. If i call 
deleteAll() without opening a searcher before it just works fine as expected. 
This seems to be a bug in Lucene, since closing the index searcher makes it 
impossible to delete the index.

Here is the source code:


public class LuceneTest {

private Directory dir;
private IndexWriter writer;

public void addDocs(long value) throws IOException {
Document doc = new Document();
doc.add(new LongField("ID", value, Field.Store.YES));
writer.deleteDocuments(new Term("ID", "1"));
writer.addDocument(doc);
}

public void search() throws IOException {
  IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
false));
TopDocs results = searcher.search(NumericRangeQuery.newLongRange("ID", 
1L, 2L, true, true), 1);

for ( ScoreDoc sc : results.scoreDocs) {
System.out.println(searcher.doc(sc.doc));
}

searcher.getIndexReader().close();
}

public static void main(String[] args) throws IOException {
new LuceneTest();
}

public LuceneTest() throws IOException {
dir = FSDirectory.open(new File("test"));
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, new 
StandardAnalyzer(Version.LUCENE_43));
config.setInfoStream(System.out);
writer = new IndexWriter(dir, config);

addDocs(1L); 
search();
//writer.commit(); -- If i call commit after search, then no 
IOException occurrs!

writer.deleteAll();
}
}


  was:
I have a small test prog which inserts some data in an index and after that, 
opens a searcher from the uncommitted index to search on and output the result 
to std.out. The Searcher is immediately closed. Then, i call deleteAll() on the 
index but it encounters an IOException stating that the index files could not 
be removed. I have investigated with Sysinternals and it says the file is still 
locked despite the fact that the index searcher is correctly closed. If i call 
deleteAll() without opening a searcher before it just works fine as expected. 
This seems to be a bug in Lucene, since closing the index searcher makes it 
impossible to delete the index.

Here is the source code:


public class LuceneTest {

private Directory dir;
private IndexWriter writer;

public void addDocs(long value) throws IOException {
Document doc = new Document();
doc.add(new LongField("ID", value, Field.Store.YES));
writer.deleteDocuments(new Term("ID", "1"));
writer.addDocument(doc);
}

public void search() throws IOException {
  IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
false));
TopDocs results = searcher.search(NumericRangeQuery.newLongRange("ID", 
1L, 2L, true, true), 1);

for ( ScoreDoc sc : results.scoreDocs) {
System.out.println(searcher.doc(sc.doc));
}

searcher.getIndexReader().close();
}

public static void main(String[] args) throws IOException {
new LuceneTest();
}

public LuceneTest() throws IOException {
dir = FSDirectory.open(new File("test"));
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, new 
StandardAnalyzer(Version.LUCENE_43));
config.setInfoStream(System.out);
writer = new IndexWriter(dir, config);

addDocs(1L); 
search();
//writer.commit(); -- If i call commit after search, then no 
IOException occurrs!

writer.deleteAll();
}
}



> IOException when trying to delete data from index
> -
>
> Key: LUCENE-5058
> URL: https://issues.apache.org/jira/browse/LUCENE-5058
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.3
> Environment: Windows 7
>Reporter: Christian
>
> I have a small test prog which inserts some data in an index and after that, 
> opens a searcher from the uncommitted index to search on and output the 
> result to std.out. The Searcher is immediately closed. Then, i call 
> deleteAll() on the index but it encoun

[jira] [Created] (LUCENE-5058) IOException when trying to delete data from index

2013-06-14 Thread Christian (JIRA)
Christian created LUCENE-5058:
-

 Summary: IOException when trying to delete data from index
 Key: LUCENE-5058
 URL: https://issues.apache.org/jira/browse/LUCENE-5058
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.3
 Environment: Windows 7
Reporter: Christian


I have a small test program which inserts some data into an index and then 
opens a searcher from the uncommitted index to search on and print the result 
to stdout. The searcher is immediately closed. Then I call deleteAll() on the 
index, but it encounters an IOException stating that the index files could not 
be removed. I have investigated with Sysinternals and it says the file is still 
locked, despite the fact that the index searcher is correctly closed. If I call 
deleteAll() without opening a searcher beforehand, it just works fine as 
expected. This seems to be a bug in Lucene, since closing the index searcher 
makes it impossible to delete the index.

Here is the source code:


public class LuceneTest {

private Directory dir;
private IndexWriter writer;

public void addDocs(long value) throws IOException {
Document doc = new Document();
doc.add(new LongField("ID", value, Field.Store.YES));
writer.deleteDocuments(new Term("ID", "1"));
writer.addDocument(doc);
}

public void search() throws IOException {
  IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(writer, 
false));
TopDocs results = searcher.search(NumericRangeQuery.newLongRange("ID", 
1L, 2L, true, true), 1);

for ( ScoreDoc sc : results.scoreDocs) {
System.out.println(searcher.doc(sc.doc));
}

searcher.getIndexReader().close();
}

public static void main(String[] args) throws IOException {
new LuceneTest();
}

public LuceneTest() throws IOException {
dir = FSDirectory.open(new File("test"));
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, new 
StandardAnalyzer(Version.LUCENE_43));
config.setInfoStream(System.out);
writer = new IndexWriter(dir, config);

addDocs(1L); 
search();
//writer.commit(); -- If i call commit after search, then no 
IOException occurrs!

writer.deleteAll();
}
}


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



LUCENE-2145, How to proceed on Tokenizer.close()

2013-06-14 Thread Benson Margulies
Since LUCENE-2145 is of specific practical interest to me, I'd like to pitch in.

However, it raises a bit of a compatibility thicket.

There are, I think, two rough approaches:

1) redefine close() on TokenStream to have the less surprising
semantics of 'don't use this object after it has been closed', and
change existing callers to close the reader for themselves, or provide
a closeReader().

2) leave close() as is, and define, well, 'reallyCloseIMeanIt()'.

I'm happy to do grunt work here, but I'm obviously not the person to
decide which of these is appropriate and tasteful.
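
To make the two options concrete, here is a tiny illustrative sketch (these are
placeholder interfaces, not existing Lucene API; closeReader() and
reallyCloseIMeanIt() are just the names from above):

// Illustration only: placeholder interfaces for the two proposals.
interface Option1TokenStream {
    void setReader(java.io.Reader reader); // reuse between documents, as today
    void closeReader();                    // hypothetical: release only the underlying Reader
    void close();                          // redefined: terminal, the stream must not be used again
}

interface Option2TokenStream {
    void setReader(java.io.Reader reader);
    void close();                          // unchanged: today's "close the internal Reader" semantics
    void reallyCloseIMeanIt();             // hypothetical: the new, explicitly terminal close
}

Option 1 gives the cleaner API but requires auditing existing callers of
close(); option 2 is fully compatible at the cost of a second close-like method.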

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4921) Support for Adding Documents via the Solr UI

2013-06-14 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683311#comment-13683311
 ] 

Grant Ingersoll commented on SOLR-4921:
---

Also, still need to handle the command parameters like commitWithin, etc.

> Support for Adding Documents via the Solr UI
> 
>
> Key: SOLR-4921
> URL: https://issues.apache.org/jira/browse/SOLR-4921
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.4
>
> Attachments: SOLR-4921.patch, SOLR-4921.patch, SOLR-4921.patch, 
> SOLR-4921.patch
>
>
> For demos and prototyping, it would be nice if we could add documents via the 
> admin UI.
> Various things to support:
> 1. Uploading XML, JSON, CSV, etc.
> 2. Optionally also do file upload

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Potential Wiki text on the lifecycle of an analysis component

2013-06-14 Thread Benson Margulies
I'd like to post some documentation to help other people trying to
deal with thread-safety and lifetime issues on analysis components.

Here is what I think I know; based on any corrections here, I'll post something.

Each Solr core has a schema. By default, Solr creates a schema when it
creates a core. If, however, shared schemas are enabled, then Solr
maintains a map from schema names to schemas, and cores that declare
the same schema (via the name attribute in the schema XML file) share
the schema object.

The schema declares a set of field types. Each field type is
represented by an object of some class that inherits from
org.apache.solr.schema.FieldType.

This class optionally stores two analyzers: the 'analyzer' for
indexing, and the queryAnalyzer for queries.

If a field type is declared with an <analyzer> element that has no
class name attribute, Solr creates an analyzer of type
org.apache.solr.analysis.TokenizerChain. These objects store a
TokenizerFactory, a list of TokenFilterFactories, and a list of
CharFilterFactories. They deliver, upon request, a java.io.Reader
built from the char filters or a TokenStreamComponents object
containing a new tokenizer and filter set.

Solr typically runs in a multi-threaded servlet container, so each
Solr request runs in the container thread that handled the HTTP
request. For an update request, DocInverterPerField will call
Field.tokenStream to get a new token stream. It calls close() on that
token stream when it is done (c.f. LUCENE-2145, which notes that this
only closes the internal reader). So there is a new set of analysis
components for each field for each request.

For a query, the analysis components are, not too surprisingly,
created by the query parser, since it is the query parser that must
split any relevant strings into their constituent elements.

To summarize, then, here is the typical situation.

The core has a schema. This lives for the length of the core, or in
the shared case, the core container.

The schema has field types.

Each field type has two analyzers. All of this, so far, has the
lifetime of the schema.

At update time, the analyzer is called upon to create tokenization
components with the lifetime of processing a single document.

At query time, the query analyzer is called upon to create
tokenization components with the lifetime of processing one field of
the query.
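
To make the lifetimes concrete, here is a minimal sketch against the Lucene 4.3
analysis API (the field name and filter chain are arbitrary; exactly when fresh
components are created versus reused is governed by the Analyzer's reuse
strategy, so treat this as the shape of the API rather than a precise lifecycle
statement):

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class AnalysisLifecycleSketch {

    // Long-lived object, analogous to the analyzer held by a field type in the schema.
    static final Analyzer INDEX_ANALYZER = new Analyzer() {
        @Override
        protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            // Called when the Analyzer needs a fresh tokenizer/filter chain;
            // roughly what Solr's TokenizerChain assembles from its factories.
            Tokenizer source = new StandardTokenizer(Version.LUCENE_43, reader);
            TokenStream result = new LowerCaseFilter(Version.LUCENE_43, source);
            return new TokenStreamComponents(source, result);
        }
    };

    public static void main(String[] args) throws IOException {
        // Short-lived, per-field consumption: analogous to inverting one document
        // field at update time, or analyzing one field of a query at query time.
        TokenStream ts = INDEX_ANALYZER.tokenStream("body", new StringReader("Some Field Text"));
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        try {
            ts.reset();
            while (ts.incrementToken()) {
                System.out.println(term.toString());
            }
            ts.end();
        } finally {
            ts.close(); // per LUCENE-2145, this closes the internal Reader
        }
    }
}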

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4921) Support for Adding Documents via the Solr UI

2013-06-14 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-4921:
--

Attachment: SOLR-4921.patch

> Support for Adding Documents via the Solr UI
> 
>
> Key: SOLR-4921
> URL: https://issues.apache.org/jira/browse/SOLR-4921
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.4
>
> Attachments: SOLR-4921.patch, SOLR-4921.patch, SOLR-4921.patch, 
> SOLR-4921.patch
>
>
> For demos and prototyping, it would be nice if we could add documents via the 
> admin UI.
> Various things to support:
> 1. Uploading XML, JSON, CSV, etc.
> 2. Optionally also do file upload

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4921) Support for Adding Documents via the Solr UI

2013-06-14 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-4921:
--

Attachment: SOLR-4921.patch

Actually posts the documents now.

Things to do:
# Need an icon for the documents tab
# Success/Error handling
# Test other content types
# File upload
# Document creation wizard
# Better layout of the form, etc.
# Other things I'm sure I forgot.

> Support for Adding Documents via the Solr UI
> 
>
> Key: SOLR-4921
> URL: https://issues.apache.org/jira/browse/SOLR-4921
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.4
>
> Attachments: SOLR-4921.patch, SOLR-4921.patch, SOLR-4921.patch, 
> SOLR-4921.patch
>
>
> For demos and prototyping, it would be nice if we could add documents via the 
> admin UI.
> Various things to support:
> 1. Uploading XML, JSON, CSV, etc.
> 2. Optionally also do file upload

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4921) Support for Adding Documents via the Solr UI

2013-06-14 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-4921:
--

Attachment: (was: SOLR-4921.patch)

> Support for Adding Documents via the Solr UI
> 
>
> Key: SOLR-4921
> URL: https://issues.apache.org/jira/browse/SOLR-4921
> Project: Solr
>  Issue Type: New Feature
>  Components: web gui
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 4.4
>
> Attachments: SOLR-4921.patch, SOLR-4921.patch, SOLR-4921.patch, 
> SOLR-4921.patch
>
>
> For demos and prototyping, it would be nice if we could add documents via the 
> admin UI.
> Various things to support:
> 1. Uploading XML, JSON, CSV, etc.
> 2. Optionally also do file upload

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2082) Performance improvement for merging posting lists

2013-06-14 Thread Dmitry Kan (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683303#comment-13683303
 ] 

Dmitry Kan commented on LUCENE-2082:


Hi [~whzz],

Would you potentially be interested in another postings-list idea that came up 
recently?

http://markmail.org/message/6ro7bbez3v3y5mfx#query:+page:1+mid:tywtrjjcfdbzww6f+state:results

It could have quite a high impact on index size, and it should be relatively 
easy to start an experiment using the Lucene codec technology.

Just in case you're interested.
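
For anyone who wants to try it: a minimal skeleton of such an experiment
(assuming the Lucene 4.x codec APIs; the class name and format name are made up)
can start from a PostingsFormat that delegates to the default and then replace
the reader/writer pieces one at a time. It also needs an SPI entry in
META-INF/services/org.apache.lucene.codecs.PostingsFormat.

{code:title=ExperimentalPostingsFormat.java|borderStyle=solid}
import java.io.IOException;

import org.apache.lucene.codecs.FieldsConsumer;
import org.apache.lucene.codecs.FieldsProducer;
import org.apache.lucene.codecs.PostingsFormat;
import org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat;
import org.apache.lucene.index.SegmentReadState;
import org.apache.lucene.index.SegmentWriteState;

public class ExperimentalPostingsFormat extends PostingsFormat {

    private final PostingsFormat delegate = new Lucene41PostingsFormat();

    public ExperimentalPostingsFormat() {
        super("Experimental"); // name used to look the format up again at read time
    }

    @Override
    public FieldsConsumer fieldsConsumer(SegmentWriteState state) throws IOException {
        return delegate.fieldsConsumer(state); // swap in the experimental writer here
    }

    @Override
    public FieldsProducer fieldsProducer(SegmentReadState state) throws IOException {
        return delegate.fieldsProducer(state); // swap in the experimental reader here
    }
}
{code}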

> Performance improvement for merging posting lists
> -
>
> Key: LUCENE-2082
> URL: https://issues.apache.org/jira/browse/LUCENE-2082
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Reporter: Michael Busch
>Priority: Minor
>  Labels: gsoc2013
> Fix For: 4.4
>
>
> A while ago I had an idea about how to improve the merge performance
> for posting lists. This is currently by far the most expensive part of
> segment merging due to all the VInt de-/encoding. Not sure if an idea
> for improving this was already mentioned in the past?
> So the basic idea is it to perform a raw copy of as much posting data
> as possible. The reason why this is difficult is that we have to
> remove deleted documents. But often the fraction of deleted docs in a
> segment is rather low (<10%?), so it's likely that there are quite
> long consecutive sections without any deletions.
> To find these sections we could use the skip lists. Basically at any
> point during the merge we would find the skip entry before the next
> deleted doc. All entries to this point can be copied without
> de-/encoding of the VInts. Then for the section that has deleted docs
> we perform the "normal" way of merging to remove the deletes. Then we
> check again with the skip lists if we can raw copy the next section.
> To make this work there are a few different necessary changes:
> 1) Currently the multilevel skiplist reader/writer can only deal with 
> fixed-size
> skips (16 on the lowest level). It would be an easy change to allow
> variable-size skips, but then the MultiLevelSkipListReader can't
> return numSkippedDocs anymore, which SegmentTermDocs needs -> change 2)
> 2) Store the last docID in which a term occurred in the term
> dictionary. This would also be beneficial for other use cases. By
> doing that the SegmentTermDocs#next(), #read() and #skipTo() know when
> the end of the postinglist is reached. Currently they have to track
> the df, which is why after a skip it's important to take the
> numSkippedDocs into account.
> 3) Change the merging algorithm according to my description above. It's
> important to create a new skiplist entry at the beginning of every
> block that is copied in raw mode, because its next skip entry's values
> are deltas from the beginning of the block. Also the very first posting, and
> that one only, needs to be decoded/encoded to make sure that the
> payload length is explicitly written (i.e. must not depend on the
> previous length). Also such a skip entry has to be created at the
> beginning of each source segment's posting list. With change 2) we don't
> have to worry about the positions of the skip entries. And having a few
> extra skip entries in merged segments won't hurt much.
> If a segment has no deletions at all this will avoid any
> decoding/encoding of VInts (best case). I think it will also work
> great for segments with a rather low amount of deletions. We should
> probably then have a threshold: if the number of deletes exceeds this
> threshold we should fall back to old style merging.
> I haven't implemented any of this, so there might be complications I
> haven't thought about. Please let me know if you can think of reasons
> why this wouldn't work or if you think more changes are necessary.
> I will probably not have time to work on this soon, but I wanted to
> open this issue to not forget about it :). Anyone should feel free to
> take this!
> Btw: I think the flex-indexing branch would be a great place to try this
> out as a new codec. This would also be good to figure out what APIs
> are needed to make merging fully flexible as well.
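
Not an implementation, just a toy model of the block-level decision described
above (it deliberately ignores skip lists, doc ID remapping and delta
re-basing, and treats postings as plain int[] blocks): blocks with no deleted
document are copied verbatim, blocks containing a deletion are decoded,
filtered and "re-encoded" the slow way.

{code:title=RawCopyMergeSketch.java|borderStyle=solid}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class RawCopyMergeSketch {

    static List<int[]> merge(List<int[]> blocks, BitSet deleted) {
        List<int[]> out = new ArrayList<int[]>();
        for (int[] block : blocks) {
            if (!containsDeleted(block, deleted)) {
                out.add(block);                                  // "raw copy": no per-posting work
            } else {
                out.add(reencodeWithoutDeleted(block, deleted)); // classic slow path
            }
        }
        return out;
    }

    static boolean containsDeleted(int[] block, BitSet deleted) {
        for (int doc : block) {
            if (deleted.get(doc)) {
                return true;
            }
        }
        return false;
    }

    static int[] reencodeWithoutDeleted(int[] block, BitSet deleted) {
        List<Integer> kept = new ArrayList<Integer>();
        for (int doc : block) {
            if (!deleted.get(doc)) {
                kept.add(doc);
            }
        }
        int[] result = new int[kept.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = kept.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> blocks = new ArrayList<int[]>();
        blocks.add(new int[] {0, 1, 2, 3}); // no deletions: copied as-is
        blocks.add(new int[] {4, 5, 6, 7}); // contains doc 5, which is deleted
        BitSet deleted = new BitSet();
        deleted.set(5);
        for (int[] block : merge(blocks, deleted)) {
            System.out.println(Arrays.toString(block));
        }
    }
}
{code}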

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Doubt In Apache Solr 4.3.0

2013-06-14 Thread Dmitry Kan
Hi,
You should post questions about Solr / Lucene usage to the user
list, not here. This is the developers list.

http://lucene.apache.org/solr/discussion.html#solr-user-list-solr-userlucene


2013/6/14 vignesh 

>  Hi Team,
>
>  
>
> I am Vignesh, now working with Apache Solr 4.3.0. I have
> indexed data and am able to search using queries.
>
> How do I carry out fuzzy search in Solr 4.3.0? Can you guide me through this
> process?
>
> Thanks & Regards.
>
> Vignesh.V
>
> Ninestars Information Technologies Limited.,
> 72, Greams Road, Thousand Lights, Chennai - 600 006. India.
> Landline : +91 44 2829 4226 / 36 / 56   X: 144
> www.ninestars.in

[jira] [Commented] (LUCENE-5038) Don't call MergePolicy / IndexWriter during DWPT Flush

2013-06-14 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683292#comment-13683292
 ] 

Commit Tag Bot commented on LUCENE-5038:


[trunk commit] simonw
http://svn.apache.org/viewvc?view=revision&revision=1493022

LUCENE-5038: Fix test to reliably work with all codecs / posting formats

> Don't call MergePolicy / IndexWriter during DWPT Flush
> --
>
> Key: LUCENE-5038
> URL: https://issues.apache.org/jira/browse/LUCENE-5038
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/index
>Affects Versions: 4.3
>Reporter: Simon Willnauer
>Assignee: Simon Willnauer
> Fix For: 4.4
>
> Attachments: LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch, 
> LUCENE-5038.patch, LUCENE-5038.patch, LUCENE-5038.patch
>
>
> We currently consult the indexwriter -> merge policy to decide if we need to 
> write CFS or not which is bad in many ways.
> - we should call mergepolicy only during merges
> - we should never sync on IW during DWPT flush
> - we should be able to make the decision if we need to write CFS or not 
> before flush, ie. we could write parts of the flush directly to CFS or even 
> start writing stored fields directly.
> - in the NRT case it might make sense to write all flushes to CFS to minimize 
> filedescriptors independent of the index size.
> I wonder if we can use a simple boolean for this in the IWC and get away with 
> not consulting merge policy. This would simplify concurrency a lot here 
> already.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4916) Add support to write and read Solr index files and transaction log files to and from HDFS.

2013-06-14 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683270#comment-13683270
 ] 

Mark Miller commented on SOLR-4916:
---

Thanks for taking a look AB!

bq. Re. Hadoop dependencies: the patch adds a hard dependency on Hadoop and its 
dependencies directly to Solr core. I wonder if it's possible to refactor it so 
that it could be optional and the functionality itself moved to contrib/ - this 
way only users who want to use HdfsDirectory would need Hadoop deps.

Yeah, I don't really believe in Solr contribs - they are not so useful IMO - 
it's a pain to actually pull them out, and it has to be done after the fact. 
Given that the size of the dependencies is such a small percentage of the 
current size, that we don't want to make the UpdateLog actually pluggable, and 
that it would be nice to have HDFS supported out of the box just like the local 
filesystem, I don't see being a contrib as much of a win. It saves a few 
megabytes when we are already well over 100 - and that's only if you are 
willing to pull it apart after you download it. From what I've seen, even with 
the *huge* extraction contrib, most people don't bother repackaging. It's hard 
to imagine they would for a few megabytes.

bq. Cache and BlockCache implementation

We have done some casual benchmarking - loading tweets at a high rate of speed 
while sending queries at a high rate of speed with 1 second NRT - essentially 
the worst case NRT scenario. By and large, performance has been similar to 
local filesystem performance. We will likely share some numbers when we have 
some less casual results. You do of course have to warm up the block cache 
before it really kicks in.

In terms of implementation, as I mentioned, the original HdfsDirectory comes 
from the Blur guys - we tried not to change it too much for now - not until we 
figure out whether we might evolve it with them in the future - e.g. as a 
Lucene module or something.

> Add support to write and read Solr index files and transaction log files to 
> and from HDFS.
> --
>
> Key: SOLR-4916
> URL: https://issues.apache.org/jira/browse/SOLR-4916
> Project: Solr
>  Issue Type: New Feature
>Reporter: Mark Miller
>Assignee: Mark Miller
> Attachments: SOLR-4916.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5057) Hunspell stemmer generates multiple tokens (original + stems)

2013-06-14 Thread Luca Cavanna (JIRA)
Luca Cavanna created LUCENE-5057:


 Summary: Hunspell stemmer generates multiple tokens (original + 
stems)
 Key: LUCENE-5057
 URL: https://issues.apache.org/jira/browse/LUCENE-5057
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.3
Reporter: Luca Cavanna


The hunspell stemmer seems to be generating multiple tokens: the original token 
plus the available stems.

It might be a good thing in some cases but it seems to be a different behaviour 
compared to the other stemmers and causes problems as well. I would rather have 
an option to decide whether it should output only the available stems, or the 
stems plus the original token.

Here is my issue: I have a query composed of multiple terms, which is analyzed 
using stemming and a boolean query is generated out of it. All fine when adding 
all clauses as should (OR operator), but if I add all clauses as must (AND 
operator), then I can get back only the documents that contain the stem 
originated by the exactly same original word.

Example for the dutch language I'm working with: fiets (means bicycle in 
dutch), its plural is fietsen.

If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index 
"fiets" I get the only "fiets" indexed.

When I query for "fietsen whatever" I get the following boolean query: 
field:fiets field:fietsen field:whatever.

If I apply the AND operator and use must clauses for each subquery, then I can 
only find the documents that originally contained "fietsen", not the ones that 
originally contained "fiets", which is not really what stemming is about.

Any thoughts on this? I would work out a patch, I'd just need some help 
deciding the name of the option and what the default behaviour should be.
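
To make the behaviour easy to reproduce, here is a small sketch that prints
every token the filter emits for a single word (the affix/dictionary paths are
placeholders and the constructor signatures are from memory of the 4.x hunspell
module, so double-check them against your version):

{code:title=HunspellTokensDemo.java|borderStyle=solid}
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.StringReader;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.hunspell.HunspellDictionary;
import org.apache.lucene.analysis.hunspell.HunspellStemFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.util.Version;

public class HunspellTokensDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder paths: point these at a real Dutch affix/dictionary pair.
        InputStream affix = new FileInputStream("nl_NL.aff");
        InputStream dic = new FileInputStream("nl_NL.dic");
        HunspellDictionary dictionary = new HunspellDictionary(affix, dic, Version.LUCENE_43);

        Tokenizer tokenizer = new WhitespaceTokenizer(Version.LUCENE_43, new StringReader("fietsen"));
        TokenStream ts = new HunspellStemFilter(tokenizer, dictionary);

        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        PositionIncrementAttribute posInc = ts.addAttribute(PositionIncrementAttribute.class);
        ts.reset();
        while (ts.incrementToken()) {
            // Expect (per this report) the original term plus its stems,
            // typically stacked at the same position.
            System.out.println(term.toString() + " (posInc=" + posInc.getPositionIncrement() + ")");
        }
        ts.end();
        ts.close();
    }
}
{code}

With "fietsen" as input this should show the original term alongside the stem,
which is exactly what breaks the AND case above; an option that drops the
original token when a stem is found would make the filter behave more like the
snowball stemmers.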

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4916) Add support to write and read Solr index files and transaction log files to and from HDFS.

2013-06-14 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683219#comment-13683219
 ] 

Andrzej Bialecki  commented on SOLR-4916:
-

Mark, this functionality looks very cool!

Re. Hadoop dependencies: the patch adds a hard dependency on Hadoop and its 
dependencies directly to Solr core. I wonder if it's possible to refactor it so 
that it could be optional and the functionality itself moved to contrib/ - this 
way only users who want to use HdfsDirectory would need Hadoop deps.

Re. Cache and BlockCache implementation - I did something similar in Luke's 
FsDirectory, where I decided to use Ehcache, although that implementation was 
read-only so it was much simpler. Performance improvements for repeated 
searches were of course dramatic, though not so much for unique queries. Do you 
have some preliminary benchmarks for this implementation, i.e. how much slower 
the indexing / searching is? Anyway, doing an Ehcache-based implementation of 
Cache with your patch seems straightforward, too.

There's very little javadoc / package docs for the new public classes and 
packages.

What are HdfsDirectory.LF_EXT and getNormalNames() for?

> Add support to write and read Solr index files and transaction log files to 
> and from HDFS.
> --
>
> Key: SOLR-4916
> URL: https://issues.apache.org/jira/browse/SOLR-4916
> Project: Solr
>  Issue Type: New Feature
>Reporter: Mark Miller
>Assignee: Mark Miller
> Attachments: SOLR-4916.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org