[Lucene.Net] fyi - lucene.net github mirror
I first posted on Apache infra to make sure the mirror lag wasn't an issue on Apache's side, and they confirmed it wasn't. Followed up with a post on the github.com support / discussion list: http://support.github.com/discussions/repos/5948-lucenenet-apache-mirror-is-still-out-of-date In case anyone wants to subscribe to the conversation or add comments, you'll need a GitHub account. - Michael
RE: failure of some PyLucene tests on windows OS
I'd expect anyone running on Windows to see these test failures. Andi.. so what do you think about this issue - can we ignore it, claim it's a Windows bug, or hope that 'just' the test code is wrong? I'd suggest at least applying the mentioned fix (i.e. uncommenting the close call in test_PyLucene) to make this test run on Windows. Of course someone should confirm this doesn't break the tests on Linux (or other OSes)... just my thoughts. Regards, Thomas
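For anyone wondering why only Windows trips over this: Windows refuses to delete a file while another handle to it is still open, whereas POSIX systems allow it. The fix Thomas mentions (close the store before teardown deletes the files) can be sketched without any Lucene/PyLucene dependency; all names below are illustrative, not the actual test code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseBeforeDelete {
    // Models the test teardown: delete an index file that a writer wrote.
    // On Windows the delete fails while the stream is still open, so the
    // fix is to close first; closing first is harmless on Linux too.
    public static boolean teardown(Path file, OutputStream store) throws IOException {
        store.close();                     // the fix: close before deleting
        return Files.deleteIfExists(file); // now succeeds on every OS
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("pylucene-test", ".bin");
        OutputStream store = Files.newOutputStream(tmp);
        store.write(42);
        System.out.println(teardown(tmp, store));
    }
}
```

This is why the uncommented close call should be safe cross-platform: it only makes the delete's precondition explicit.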
[jira] [Commented] (LUCENE-2571) Indexing performance tests with realtime branch
[ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020206#comment-13020206 ]

Simon Willnauer commented on LUCENE-2571:
-----------------------------------------

bq. Would you consider trying other MergePolicy objects on trunk? The BalancedSegment MP tries to avoid these long stoppages.

I think there is a misunderstanding on your side. The long stoppages on trunk are not due to merges at all. They are due to flushing the DocumentsWriter, which essentially means stop the world. This is why we cannot make any progress. Merges are NOT blocking indexing on trunk, no matter which MP you use. The Balanced MP is rather suited for RT environments, to make reopening the reader quicker. You should maybe look at this blog entry for a more complete explanation: http://blog.jteam.nl/2011/04/01/gimme-all-resources-you-have-i-can-use-them/

Indexing performance tests with realtime branch
-----------------------------------------------

Key: LUCENE-2571
URL: https://issues.apache.org/jira/browse/LUCENE-2571
Project: Lucene - Java
Issue Type: Task
Components: Index
Reporter: Michael Busch
Priority: Minor
Fix For: Realtime Branch
Attachments: wikimedium.realtime.Standard.nd10M_dps.png, wikimedium.realtime.Standard.nd10M_dps_addDocuments.png, wikimedium.realtime.Standard.nd10M_dps_addDocuments_flush.png, wikimedium.trunk.Standard.nd10M_dps.png, wikimedium.trunk.Standard.nd10M_dps_addDocuments.png

We should run indexing performance tests with the DWPT changes and compare to trunk. We need to test both single-threaded and multi-threaded performance. NOTE: flush by RAM isn't implemented just yet, so either we wait with the tests or flush by doc count.

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
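The "stop the world" distinction Simon draws can be pictured with a deliberately tiny toy model (not Lucene's actual code; every name here is illustrative): when all indexing threads share one writer behind one lock, a flush holds that lock and every addDocument() stalls until the flush completes, which is exactly what DWPT removes by giving each thread its own buffer.

```java
import java.util.concurrent.locks.ReentrantLock;

// Toy model of trunk's single shared DocumentsWriter: one global lock
// means flushing is a stop-the-world event for all indexing threads.
// Illustrative only; Lucene's real classes are far more involved.
public class GlobalFlushModel {
    private final ReentrantLock lock = new ReentrantLock();
    private int buffered = 0;
    private int flushed = 0;

    public void addDocument() {
        lock.lock();           // every indexing thread funnels through here
        try { buffered++; } finally { lock.unlock(); }
    }

    public void flush() {
        lock.lock();           // while held, no addDocument() can proceed
        try { flushed += buffered; buffered = 0; } finally { lock.unlock(); }
    }

    public int flushedDocs() { return flushed; }

    public static void main(String[] args) {
        GlobalFlushModel model = new GlobalFlushModel();
        model.addDocument();
        model.addDocument();
        model.flush();
        System.out.println(model.flushedDocs());
    }
}
```

A merge policy never enters this picture, which is why swapping MPs cannot shorten these stalls: the lock is taken by flushing, not merging.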
[jira] [Commented] (LUCENE-2571) Indexing performance tests with realtime branch
[ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020217#comment-13020217 ]

Earwin Burrfoot commented on LUCENE-2571:
-----------------------------------------

bq. Merges are NOT blocking indexing on trunk no matter which MP you use.

Well.. merges tie up IO (especially if not on fancy SSDs/RAIDs), which in turn lags flushes - bigger delays for stop-the-world flushes / a lower bandwidth cap (after which they are forced to stop the world) for parallel flushes. So Lance's point is partially valid.
[jira] [Commented] (LUCENE-3028) IW.getReader() returns inconsistent reader on RT Branch
[ https://issues.apache.org/jira/browse/LUCENE-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020222#comment-13020222 ]

selckin commented on LUCENE-3028:
---------------------------------

Hasn't failed since the above fix.

IW.getReader() returns inconsistent reader on RT Branch
-------------------------------------------------------

Key: LUCENE-3028
URL: https://issues.apache.org/jira/browse/LUCENE-3028
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions: Realtime Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Fix For: Realtime Branch
Attachments: LUCENE-3028.patch, LUCENE-3028.patch, realtime-1.txt

I extended the testcase TestRollingUpdates#testUpdateSameDoc to pull an NRT reader after each update and asserted that it always sees only one document. Yet, this fails with the current branch since there is a problem in how we flush in the getReader() case. What happens here is that we flush all threads and then release the lock (letting other flushes, which came in after we entered the flushAllThreads context, continue), so we can concurrently get a new segment that transports global deletes without the corresponding add. They sneak in while we continue to open the NRT reader, which in turn sees inconsistent results. I will upload a patch soon.
[jira] [Commented] (LUCENE-3028) IW.getReader() returns inconsistent reader on RT Branch
[ https://issues.apache.org/jira/browse/LUCENE-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020224#comment-13020224 ]

Simon Willnauer commented on LUCENE-3028:
-----------------------------------------

bq. hasn't failed since above fix

Thanks for reporting back. The failure you reported was due to a reset call at the wrong position: I was allowing blocked flushes to continue before I reset the variable that prevents blocked flushes from continuing until the full flush has finished.
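The ordering bug Simon describes reduces to a gate that must stay closed until the NRT reader snapshot is taken. A minimal sketch of the invariant (illustrative names only; this is not Lucene's DocumentsWriter code): concurrent flushes check the gate before publishing a segment, and the fix is to reset the gate only after the reader has been opened.

```java
import java.util.function.Supplier;

// Toy sketch of the getReader() full-flush gate. If the flag is reset
// before the reader snapshot is taken, a segment carrying global deletes
// without the matching add can sneak into the reader's view.
public class FullFlushGate {
    private volatile boolean fullFlushInProgress = false;

    public void beginFullFlush() { fullFlushInProgress = true; }

    // Concurrent flushes consult this before publishing new segments.
    public boolean mayPublishSegment() { return !fullFlushInProgress; }

    // Correct order: take the reader snapshot FIRST, then open the gate.
    public String finishFullFlush(Supplier<String> openReader) {
        String reader = openReader.get(); // snapshot taken while gate is closed
        fullFlushInProgress = false;      // the fix: reset only after the snapshot
        return reader;
    }

    public static void main(String[] args) {
        FullFlushGate gate = new FullFlushGate();
        gate.beginFullFlush();
        System.out.println(gate.finishFullFlush(() -> "nrt-reader"));
    }
}
```

Swapping the two statements in finishFullFlush reproduces the reported race: the gate opens, a blocked flush publishes, and only then is the reader opened over an inconsistent set of segments.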
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

selckin updated LUCENE-3023:
    Attachment: realtime-TestAddIndexes-3.txt

Land DWPT on trunk
------------------

Key: LUCENE-3023
URL: https://issues.apache.org/jira/browse/LUCENE-3023
Project: Lucene - Java
Issue Type: Task
Affects Versions: CSF branch, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Fix For: 4.0
Attachments: realtime-TestAddIndexes-3.txt, realtime-TestAddIndexes-5.txt, realtime-TestIndexWriterExceptions-assert-6.txt, realtime-TestIndexWriterExceptions-npe-1.txt, realtime-TestIndexWriterExceptions-npe-2.txt, realtime-TestIndexWriterExceptions-npe-4.txt, realtime-TestOmitTf-corrupt-0.txt

With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324, so we can proceed with landing the DWPT development on trunk soon. I think one of the bigger issues here is to make sure that all JavaDocs for IW etc. are still correct, though. I will start going through that first.
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

selckin updated LUCENE-3023:
    Attachment: realtime-TestOmitTf-corrupt-0.txt
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

selckin updated LUCENE-3023:
    Attachment: realtime-TestIndexWriterExceptions-npe-2.txt
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

selckin updated LUCENE-3023:
    Attachment: realtime-TestIndexWriterExceptions-assert-6.txt
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

selckin updated LUCENE-3023:
    Attachment: realtime-TestAddIndexes-5.txt
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

selckin updated LUCENE-3023:
    Attachment: realtime-TestIndexWriterExceptions-npe-1.txt
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

selckin updated LUCENE-3023:
    Attachment: realtime-TestIndexWriterExceptions-npe-4.txt
[jira] [Updated] (LUCENE-2571) Indexing performance tests with realtime branch
[ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-2571:
    Attachment: wikimedium.trunk.Standard.nd10M_dps_BalancedSegmentMergePolicy.png
[jira] [Commented] (LUCENE-2571) Indexing performance tests with realtime branch
[ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020230#comment-13020230 ]

Simon Willnauer commented on LUCENE-2571:
-----------------------------------------

bq. Well.. merges tie up IO (especially if not on fancy SSDs/RAIDs), which in turn lags flushes - bigger delays for stop-the-world flushes / a lower bandwidth cap (after which they are forced to stop the world) for parallel flushes.

True, it will make a difference in certain situations, but not for this benchmark. RT does way more merges here since we are flushing way more segments. The time window I used here is one where we almost don't merge at all in the trunk run, so it should not make a difference. I ran those benchmarks again with BalancedSegmentMergePolicy and it really doesn't make any difference. See below:

!wikimedium.trunk.Standard.nd10M_dps_BalancedSegmentMergePolicy.png!
[jira] [Created] (LUCENE-3031) setFlushPending fails if we concurrently
setFlushPending fails if we concurrently
----------------------------------------

Key: LUCENE-3031
URL: https://issues.apache.org/jira/browse/LUCENE-3031
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions: Realtime Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Fix For: Realtime Branch

If we select a DWPT for flushing, but that DWPT is currently in flight and hits an exception after we selected it for flushing, its number of docs is reset to 0 and we trip this assertion. So we should rather check whether it is 0 than assert on it here.

{noformat}
[junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions
[junit] Testcase: testRandomExceptionsThreads(org.apache.lucene.index.TestIndexWriterExceptions): FAILED
[junit] thread Indexer 3: hit unexpected failure
[junit] junit.framework.AssertionFailedError: thread Indexer 3: hit unexpected failure
[junit]   at org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:227)
[junit]   at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226)
[junit]   at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154)
[junit]
[junit] Tests run: 18, Failures: 1, Errors: 0, Time elapsed: 30.287 sec
[junit]
[junit] - Standard Output ---
[junit] Indexer 3: unexpected exception2
[junit] java.lang.AssertionError
[junit]   at org.apache.lucene.index.DocumentsWriterFlushControl.setFlushPending(DocumentsWriterFlushControl.java:170)
[junit]   at org.apache.lucene.index.FlushPolicy.markLargestWriterPending(FlushPolicy.java:108)
[junit]   at org.apache.lucene.index.FlushByRamOrCountsPolicy.onInsert(FlushByRamOrCountsPolicy.java:61)
[junit]   at org.apache.lucene.index.FlushPolicy.onUpdate(FlushPolicy.java:77)
[junit]   at org.apache.lucene.index.DocumentsWriterFlushControl.doAfterDocument(DocumentsWriterFlushControl.java:115)
[junit]   at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:341)
[junit]   at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1367)
[junit]   at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1339)
[junit]   at org.apache.lucene.index.TestIndexWriterExceptions$IndexerThread.run(TestIndexWriterExceptions.java:92)
[junit] - ---
[junit] - Standard Error -
[junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterExceptions -Dtestmethod=testRandomExceptionsThreads -Dtests.seed=3493970007652348212:2010109588873167237
[junit] WARNING: test method: 'testRandomExceptionsThreads' left thread running: merge thread: _1v(4.0):Cv2 _27(4.0):cv1 into _2h
[junit] WARNING: test method: 'testRandomExceptionsThreads' left thread running: merge thread: _2c(4.0):cv1 into _2m
[junit] RESOURCE LEAK: test method: 'testRandomExceptionsThreads' left 2 thread(s) running
[junit] NOTE: test params are: codec=RandomCodecProvider: {content=MockFixedIntBlock(blockSize=421), field=MockSep, id=SimpleText, other=MockSep, contents=MockRandom, content1=Pulsing(freqCutoff=11), content2=MockSep, content4=SimpleText, content5=SimpleText, content6=MockRandom, crash=MockRandom, content7=MockVariableIntBlock(baseBlockSize=109)}, locale=mk_MK, timezone=Europe/Malta
[junit] NOTE: all tests run in this JVM: [TestToken, TestDateTools, Test2BTerms, TestAddIndexes, TestFilterIndexReader, TestIndexWriterExceptions]
[junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_24 (64-bit)/cpus=8,threads=1,free=78897400,total=195821568
{noformat}
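The proposed fix ("rather check if it is 0 than assert on it") can be sketched in isolation. This is an illustrative stand-in, not the actual DocumentsWriterFlushControl code: the point is simply that a concurrent abort can legitimately zero the DWPT's doc count between selection and marking, so that state must be handled, not asserted away.

```java
// Sketch of the LUCENE-3031 fix as described in the issue text.
// Illustrative names; not Lucene's real setFlushPending signature.
public class FlushPendingCheck {
    // before: assert bufferedDocs > 0;  // trips when an abort zeroed the DWPT
    public static boolean trySetFlushPending(int bufferedDocs) {
        if (bufferedDocs == 0) {
            return false;  // DWPT was aborted concurrently; nothing to flush
        }
        // ... mark the DWPT as flush-pending ...
        return true;
    }

    public static void main(String[] args) {
        System.out.println(trySetFlushPending(0));
        System.out.println(trySetFlushPending(3));
    }
}
```

Turning the assertion into a check makes the race benign: the flush policy just skips the emptied DWPT instead of killing the indexer thread.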
[jira] [Updated] (LUCENE-3031) setFlushPending fails if we concurrently
[ https://issues.apache.org/jira/browse/LUCENE-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-3031:
    Attachment: LUCENE-3031.patch

Here is a patch... I will commit shortly.
[jira] [Created] (LUCENE-3032) TestIndexWriterException fails with NPE on realtime
TestIndexWriterException fails with NPE on realtime
---------------------------------------------------

Key: LUCENE-3032
URL: https://issues.apache.org/jira/browse/LUCENE-3032
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions: Realtime Branch
Reporter: Simon Willnauer
Fix For: Realtime Branch

{noformat}
[junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions
[junit] Testcase: testRandomExceptionsThreads(org.apache.lucene.index.TestIndexWriterExceptions): Caused an ERROR
[junit] (null)
[junit] java.lang.NullPointerException
[junit]   at org.apache.lucene.index.DocumentsWriterPerThread.prepareFlush(DocumentsWriterPerThread.java:329)
[junit]   at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:378)
[junit]   at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:512)
[junit]   at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:2619)
[junit]   at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2594)
[junit]   at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2464)
[junit]   at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2537)
[junit]   at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2519)
[junit]   at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2503)
[junit]   at org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:230)
[junit]   at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226)
[junit]   at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154)
[junit]
[junit] Tests run: 18, Failures: 0, Errors: 1, Time elapsed: 22.548 sec
[junit]
[junit] - Standard Error -
[junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterExceptions -Dtestmethod=testRandomExceptionsThreads -Dtests.seed=-5079747362001734044:1572064802119081373
[junit] WARNING: test method: 'testRandomExceptionsThreads' left thread running: merge thread: _25(4.0):cv2/1 _29(4.0):cv2/1 _20(4.0):cv3/1 into _2m
[junit] RESOURCE LEAK: test method: 'testRandomExceptionsThreads' left 1 thread(s) running
[junit] NOTE: test params are: codec=RandomCodecProvider: {content=Pulsing(freqCutoff=2), field=MockSep, id=Pulsing(freqCutoff=2), other=MockSep, contents=SimpleText, content1=MockSep, content2=SimpleText, content4=MockRandom, content5=MockRandom, content6=MockVariableIntBlock(baseBlockSize=41), crash=Standard, content7=MockFixedIntBlock(blockSize=1633)}, locale=en_GB, timezone=Europe/Vaduz
[junit] NOTE: all tests run in this JVM: [TestToken, TestDateTools, Test2BTerms, TestAddIndexes, TestFilterIndexReader, TestIndexWriterExceptions]
[junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_24 (64-bit)/cpus=8,threads=1,free=155417240,total=292945920
{noformat}
[jira] [Updated] (LUCENE-3031) setFlushPending fails if we concurrently hit an aborting exception
[ https://issues.apache.org/jira/browse/LUCENE-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3031: Summary: setFlushPending fails if we concurrently hit a aborting exception (was: setFlushPending fails if we concurrently ) setFlushPending fails if we concurrently hit a aborting exception - Key: LUCENE-3031 URL: https://issues.apache.org/jira/browse/LUCENE-3031 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: Realtime Branch Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: Realtime Branch Attachments: LUCENE-3031.patch If we select a DWPT for flushing but that DWPT is currently in flight and hits an exception after we selected them for flushing the num of docs is reset to 0 and we trip that exception. So we rather check if it is 0 than assert on it here. {noformat} [junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions [junit] Testcase: testRandomExceptionsThreads(org.apache.lucene.index.TestIndexWriterExceptions): FAILED [junit] thread Indexer 3: hit unexpected failure [junit] junit.framework.AssertionFailedError: thread Indexer 3: hit unexpected failure [junit] at org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:227) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154) [junit] [junit] [junit] Tests run: 18, Failures: 1, Errors: 0, Time elapsed: 30.287 sec [junit] [junit] - Standard Output --- [junit] Indexer 3: unexpected exception2 [junit] java.lang.AssertionError [junit] at org.apache.lucene.index.DocumentsWriterFlushControl.setFlushPending(DocumentsWriterFlushControl.java:170) [junit] at org.apache.lucene.index.FlushPolicy.markLargestWriterPending(FlushPolicy.java:108) [junit] at 
org.apache.lucene.index.FlushByRamOrCountsPolicy.onInsert(FlushByRamOrCountsPolicy.java:61) [junit] at org.apache.lucene.index.FlushPolicy.onUpdate(FlushPolicy.java:77) [junit] at org.apache.lucene.index.DocumentsWriterFlushControl.doAfterDocument(DocumentsWriterFlushControl.java:115) [junit] at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:341) [junit] at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1367) [junit] at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1339) [junit] at org.apache.lucene.index.TestIndexWriterExceptions$IndexerThread.run(TestIndexWriterExceptions.java:92) [junit] - --- [junit] - Standard Error - [junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterExceptions -Dtestmethod=testRandomExceptionsThreads -Dtests.seed=3493970007652348212:2010109588873167237 [junit] WARNING: test method: 'testRandomExceptionsThreads' left thread running: merge thread: _1v(4.0):Cv2 _27(4.0):cv1 into _2h [junit] WARNING: test method: 'testRandomExceptionsThreads' left thread running: merge thread: _2c(4.0):cv1 into _2m [junit] RESOURCE LEAK: test method: 'testRandomExceptionsThreads' left 2 thread(s) running [junit] NOTE: test params are: codec=RandomCodecProvider: {content=MockFixedIntBlock(blockSize=421), field=MockSep, id=SimpleText, other=MockSep, contents=MockRandom, content1=Pulsing(freqCutoff=11), content2=MockSep, content4=SimpleText, content5=SimpleText, content6=MockRandom, crash=MockRandom, content7=MockVariableIntBlock(baseBlockSize=109)}, locale=mk_MK, timezone=Europe/Malta [junit] NOTE: all tests run in this JVM: [junit] [TestToken, TestDateTools, Test2BTerms, TestAddIndexes, TestFilterIndexReader, TestIndexWriterExceptions] [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_24 (64-bit)/cpus=8,threads=1,free=78897400,total=195821568 {noformat} -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
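In code terms, the LUCENE-3031 fix described above replaces an assertion with a check: a DWPT selected for flushing may have hit an aborting exception concurrently, zeroing its document count, so the flush-control code should detect that state rather than assert on it. The sketch below is a minimal, self-contained illustration; the class, field, and method names are hypothetical stand-ins, not the actual Lucene source.

```java
// Hypothetical, simplified model of the LUCENE-3031 fix -- not Lucene source.
// A DWPT selected for flushing can abort concurrently, resetting its doc
// count to 0. Asserting numDocsInRAM > 0 then trips; checking it does not.
public class FlushControlSketch {

    /** Stand-in for a DocumentsWriterPerThread's per-thread state. */
    public static final class ThreadState {
        volatile int numDocsInRAM;
        boolean flushPending;
        public ThreadState(int docs) { this.numDocsInRAM = docs; }
    }

    /**
     * Marks the state flush-pending. Returns false (instead of asserting)
     * when a concurrent abort has already reset the doc count to zero.
     */
    public static boolean setFlushPending(ThreadState state) {
        if (state.numDocsInRAM == 0) {
            // Concurrent aborting exception emptied this DWPT: nothing to
            // flush, so quietly decline rather than trip an assertion.
            return false;
        }
        state.flushPending = true;
        return true;
    }
}
```

The point of the change is that an empty DWPT is a legal intermediate state under concurrency, not an invariant violation.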
[jira] [Updated] (LUCENE-3032) TestIndexWriterException fails with NPE on realtime
[ https://issues.apache.org/jira/browse/LUCENE-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3032: Attachment: LUCENE-3032.patch if we never succeed adding a document and hit only non-aborting exceptions the delete slice gets never initialized which is ok but needs to be checked in the prepare flush method. here is a patch TestIndexWriterException fails with NPE on realtime --- Key: LUCENE-3032 URL: https://issues.apache.org/jira/browse/LUCENE-3032 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: Realtime Branch Reporter: Simon Willnauer Fix For: Realtime Branch Attachments: LUCENE-3032.patch {noformat} [junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions [junit] Testcase: testRandomExceptionsThreads(org.apache.lucene.index.TestIndexWriterExceptions): Caused an ERROR [junit] (null) [junit] java.lang.NullPointerException [junit] at org.apache.lucene.index.DocumentsWriterPerThread.prepareFlush(DocumentsWriterPerThread.java:329) [junit] at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:378) [junit] at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:512) [junit] at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:2619) [junit] at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2594) [junit] at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2464) [junit] at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2537) [junit] at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2519) [junit] at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2503) [junit] at org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:230) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226) [junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154) [junit] [junit] [junit] Tests run: 18, Failures: 0, Errors: 1, Time elapsed: 22.548 sec [junit] [junit] - Standard Error - [junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterExceptions -Dtestmethod=testRandomExceptionsThreads -Dtests.seed=-5079747362001734044:1572064802119081373 [junit] WARNING: test method: 'testRandomExceptionsThreads' left thread running: merge thread: _25(4.0):cv2/1 _29(4.0):cv2/1 _20(4.0):cv3/1 into _2m [junit] RESOURCE LEAK: test method: 'testRandomExceptionsThreads' left 1 thread(s) running [junit] NOTE: test params are: codec=RandomCodecProvider: {content=Pulsing(freqCutoff=2), field=MockSep, id=Pulsing(freqCutoff=2), other=MockSep, contents=SimpleText, content1=MockSep, content2=SimpleText, content4=MockRandom, content5=MockRandom, content6=MockVariableIntBlock(baseBlockSize=41), crash=Standard, content7=MockFixedIntBlock(blockSize=1633)}, locale=en_GB, timezone=Europe/Vaduz [junit] NOTE: all tests run in this JVM: [junit] [TestToken, TestDateTools, Test2BTerms, TestAddIndexes, TestFilterIndexReader, TestIndexWriterExceptions] [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_24 (64-bit)/cpus=8,threads=1,free=155417240,total=292945920 [junit] - --- {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
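The LUCENE-3032 patch described above boils down to a null guard: when every addDocument call hit a non-aborting exception, the DWPT's delete slice was never initialized, and prepareFlush dereferenced null. A minimal sketch of that shape follows; the names are hypothetical stand-ins, not the actual Lucene source.

```java
// Hypothetical, simplified model of the LUCENE-3032 fix -- not Lucene source.
// If no document ever succeeded, the pending-deletes slice is never created;
// prepareFlush must treat that as "nothing to apply" rather than NPE.
public class PrepareFlushSketch {

    /** Minimal stand-in for a DWPT's pending-deletes slice. */
    public static final class DeleteSlice {
        public int applied;
        void apply() { applied++; }
    }

    /** Returns true if a slice existed and was applied, false otherwise. */
    public static boolean prepareFlush(DeleteSlice slice) {
        if (slice == null) {
            // Legal state: only non-aborting exceptions occurred, so the
            // slice was never initialized and there is nothing to apply.
            return false;
        }
        slice.apply();
        return true;
    }
}
```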
[jira] [Resolved] (LUCENE-3031) setFlushPending fails if we concurrently hit an aborting exception
[ https://issues.apache.org/jira/browse/LUCENE-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-3031. - Resolution: Fixed committed to branch setFlushPending fails if we concurrently hit a aborting exception - Key: LUCENE-3031 URL: https://issues.apache.org/jira/browse/LUCENE-3031 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: Realtime Branch Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: Realtime Branch Attachments: LUCENE-3031.patch If we select a DWPT for flushing but that DWPT is currently in flight and hits an exception after we selected them for flushing the num of docs is reset to 0 and we trip that exception. So we rather check if it is 0 than assert on it here. {noformat} [junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions [junit] Testcase: testRandomExceptionsThreads(org.apache.lucene.index.TestIndexWriterExceptions): FAILED [junit] thread Indexer 3: hit unexpected failure [junit] junit.framework.AssertionFailedError: thread Indexer 3: hit unexpected failure [junit] at org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:227) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154) [junit] [junit] [junit] Tests run: 18, Failures: 1, Errors: 0, Time elapsed: 30.287 sec [junit] [junit] - Standard Output --- [junit] Indexer 3: unexpected exception2 [junit] java.lang.AssertionError [junit] at org.apache.lucene.index.DocumentsWriterFlushControl.setFlushPending(DocumentsWriterFlushControl.java:170) [junit] at org.apache.lucene.index.FlushPolicy.markLargestWriterPending(FlushPolicy.java:108) [junit] at org.apache.lucene.index.FlushByRamOrCountsPolicy.onInsert(FlushByRamOrCountsPolicy.java:61) [junit] at 
org.apache.lucene.index.FlushPolicy.onUpdate(FlushPolicy.java:77) [junit] at org.apache.lucene.index.DocumentsWriterFlushControl.doAfterDocument(DocumentsWriterFlushControl.java:115) [junit] at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:341) [junit] at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1367) [junit] at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1339) [junit] at org.apache.lucene.index.TestIndexWriterExceptions$IndexerThread.run(TestIndexWriterExceptions.java:92) [junit] - --- [junit] - Standard Error - [junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterExceptions -Dtestmethod=testRandomExceptionsThreads -Dtests.seed=3493970007652348212:2010109588873167237 [junit] WARNING: test method: 'testRandomExceptionsThreads' left thread running: merge thread: _1v(4.0):Cv2 _27(4.0):cv1 into _2h [junit] WARNING: test method: 'testRandomExceptionsThreads' left thread running: merge thread: _2c(4.0):cv1 into _2m [junit] RESOURCE LEAK: test method: 'testRandomExceptionsThreads' left 2 thread(s) running [junit] NOTE: test params are: codec=RandomCodecProvider: {content=MockFixedIntBlock(blockSize=421), field=MockSep, id=SimpleText, other=MockSep, contents=MockRandom, content1=Pulsing(freqCutoff=11), content2=MockSep, content4=SimpleText, content5=SimpleText, content6=MockRandom, crash=MockRandom, content7=MockVariableIntBlock(baseBlockSize=109)}, locale=mk_MK, timezone=Europe/Malta [junit] NOTE: all tests run in this JVM: [junit] [TestToken, TestDateTools, Test2BTerms, TestAddIndexes, TestFilterIndexReader, TestIndexWriterExceptions] [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_24 (64-bit)/cpus=8,threads=1,free=78897400,total=195821568 {noformat} -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (LUCENE-3032) TestIndexWriterException fails with NPE on realtime
[ https://issues.apache.org/jira/browse/LUCENE-3032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-3032. - Resolution: Fixed committed to branch TestIndexWriterException fails with NPE on realtime --- Key: LUCENE-3032 URL: https://issues.apache.org/jira/browse/LUCENE-3032 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: Realtime Branch Reporter: Simon Willnauer Fix For: Realtime Branch Attachments: LUCENE-3032.patch {noformat} [junit] Testsuite: org.apache.lucene.index.TestIndexWriterExceptions [junit] Testcase: testRandomExceptionsThreads(org.apache.lucene.index.TestIndexWriterExceptions): Caused an ERROR [junit] (null) [junit] java.lang.NullPointerException [junit] at org.apache.lucene.index.DocumentsWriterPerThread.prepareFlush(DocumentsWriterPerThread.java:329) [junit] at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:378) [junit] at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:512) [junit] at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:2619) [junit] at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2594) [junit] at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2464) [junit] at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2537) [junit] at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2519) [junit] at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2503) [junit] at org.apache.lucene.index.TestIndexWriterExceptions.testRandomExceptionsThreads(TestIndexWriterExceptions.java:230) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154) [junit] [junit] [junit] Tests run: 18, Failures: 0, Errors: 1, Time elapsed: 22.548 sec [junit] [junit] - Standard Error - [junit] NOTE: reproduce with: 
ant test -Dtestcase=TestIndexWriterExceptions -Dtestmethod=testRandomExceptionsThreads -Dtests.seed=-5079747362001734044:1572064802119081373 [junit] WARNING: test method: 'testRandomExceptionsThreads' left thread running: merge thread: _25(4.0):cv2/1 _29(4.0):cv2/1 _20(4.0):cv3/1 into _2m [junit] RESOURCE LEAK: test method: 'testRandomExceptionsThreads' left 1 thread(s) running [junit] NOTE: test params are: codec=RandomCodecProvider: {content=Pulsing(freqCutoff=2), field=MockSep, id=Pulsing(freqCutoff=2), other=MockSep, contents=SimpleText, content1=MockSep, content2=SimpleText, content4=MockRandom, content5=MockRandom, content6=MockVariableIntBlock(baseBlockSize=41), crash=Standard, content7=MockFixedIntBlock(blockSize=1633)}, locale=en_GB, timezone=Europe/Vaduz [junit] NOTE: all tests run in this JVM: [junit] [TestToken, TestDateTools, Test2BTerms, TestAddIndexes, TestFilterIndexReader, TestIndexWriterExceptions] [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_24 (64-bit)/cpus=8,threads=1,free=155417240,total=292945920 [junit] - --- {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020241#comment-13020241 ] Simon Willnauer commented on LUCENE-3023: - selckin thanks for reporting those failures... I fixed the TestIndexWriterException ones in LUCENE-3031 and LUCENE-3032. The TestOmitTf failure caused by a recently fixed bug on trunk (LUCENE-3027) which I haven't merged into RT branch yet. I just did the merge and that fixes that issue too. I will commit the merge in a minute The issues you are reporting with addIndexes I can not reproduce though... I will spin off issues for them. Land DWPT on trunk -- Key: LUCENE-3023 URL: https://issues.apache.org/jira/browse/LUCENE-3023 Project: Lucene - Java Issue Type: Task Affects Versions: CSF branch, 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: realtime-TestAddIndexes-3.txt, realtime-TestAddIndexes-5.txt, realtime-TestIndexWriterExceptions-assert-6.txt, realtime-TestIndexWriterExceptions-npe-1.txt, realtime-TestIndexWriterExceptions-npe-2.txt, realtime-TestIndexWriterExceptions-npe-4.txt, realtime-TestOmitTf-corrupt-0.txt With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so we can proceed landing the DWPT development on trunk soon. I think one of the bigger issues here is to make sure that all JavaDocs for IW etc. are still correct though. I will start going through that first. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3018) Lucene Native Directory implementation needs automated build
[ https://issues.apache.org/jira/browse/LUCENE-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Thacker updated LUCENE-3018: -- Attachment: cpptasks.jar cpptasks-LICENSE-ASL.txt I built cpptasks-1.0b5 using jdk-1.5 . I have also attached the LICENSE file by renaming it according to license naming convention. Lucene Native Directory implementation need automated build --- Key: LUCENE-3018 URL: https://issues.apache.org/jira/browse/LUCENE-3018 Project: Lucene - Java Issue Type: Wish Components: Build Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Varun Thacker Priority: Minor Fix For: 4.0 Attachments: LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, cpptasks-LICENSE-ASL.txt, cpptasks.jar, cpptasks.jar Currently the native directory impl in contrib/misc require manual action to compile the c code (partially) documented in https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/contrib/misc/src/java/overview.html yet it would be nice if we had an ant task and documentation for all platforms how to compile them and set up the prerequisites. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3033) TestAddIndexes#testAddIndexesWithThreads fails on Realtime
TestAddIndexes#testAddIndexesWithThreads fails on Realtime -- Key: LUCENE-3033 URL: https://issues.apache.org/jira/browse/LUCENE-3033 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: Realtime Branch Reporter: Simon Willnauer Fix For: Realtime Branch Selckin reported two failures on LUCENE-3023 which I can unfortunately not reproduce at all. here are the traces {noformat} [junit] Testsuite: org.apache.lucene.index.TestAddIndexes [junit] Testcase: testAddIndexesWithThreads(org.apache.lucene.index.TestAddIndexes):FAILED [junit] expected:3160 but was:3060 [junit] junit.framework.AssertionFailedError: expected:3160 but was:3060 [junit] at org.apache.lucene.index.TestAddIndexes.testAddIndexesWithThreads(TestAddIndexes.java:783) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154) [junit] [junit] [junit] Tests run: 18, Failures: 1, Errors: 0, Time elapsed: 14.272 sec [junit] [junit] - Standard Error - [junit] NOTE: reproduce with: ant test -Dtestcase=TestAddIndexes -Dtestmethod=testAddIndexesWithThreads -Dtests.seed=6128854208955988865:2552774338676281184 [junit] NOTE: test params are: codec=PreFlex, locale=no_NO_NY, timezone=America/Edmonton [junit] NOTE: all tests run in this JVM: [junit] [TestToken, TestDateTools, Test2BTerms, TestAddIndexes] [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 
1.6.0_24 (64-bit)/cpus=8,threads=1,free=84731792,total=258080768 [junit] - --- {noformat} and {noformat} [junit] Testsuite: org.apache.lucene.index.TestAddIndexes [junit] Testcase: testAddIndexesWithThreads(org.apache.lucene.index.TestAddIndexes):FAILED [junit] expected:3160 but was:3060 [junit] junit.framework.AssertionFailedError: expected:3160 but was:3060 [junit] at org.apache.lucene.index.TestAddIndexes.testAddIndexesWithThreads(TestAddIndexes.java:783) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154) [junit] [junit] [junit] Tests run: 18, Failures: 1, Errors: 0, Time elapsed: 14.841 sec [junit] [junit] - Standard Error - [junit] NOTE: reproduce with: ant test -Dtestcase=TestAddIndexes -Dtestmethod=testAddIndexesWithThreads -Dtests.seed=4502815121171887759:-6764285049309266272 [junit] NOTE: test params are: codec=PreFlex, locale=tr_TR, timezone=Mexico/BajaNorte [junit] NOTE: all tests run in this JVM: [junit] [TestToken, TestDateTools, Test2BTerms, TestAddIndexes] [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_24 (64-bit)/cpus=8,threads=1,free=163663416,total=243335168 [junit] - --- {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3033) TestAddIndexes#testAddIndexesWithThreads fails on Realtime
[ https://issues.apache.org/jira/browse/LUCENE-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020256#comment-13020256 ] Simon Willnauer commented on LUCENE-3033: - After 900 runs I stepped into this error but only since I got an OOM during merge is it possible that such an error is not printed out due to not enough memory? I think we hit an OOM during the handle method in CommitAndAddIndexes this could happen no? If so I think this is a false positive TestAddIndexes#testAddIndexesWithThreads fails on Realtime -- Key: LUCENE-3033 URL: https://issues.apache.org/jira/browse/LUCENE-3033 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: Realtime Branch Reporter: Simon Willnauer Fix For: Realtime Branch Selckin reported two failures on LUCENE-3023 which I can unfortunately not reproduce at all. here are the traces {noformat} [junit] Testsuite: org.apache.lucene.index.TestAddIndexes [junit] Testcase: testAddIndexesWithThreads(org.apache.lucene.index.TestAddIndexes): FAILED [junit] expected:3160 but was:3060 [junit] junit.framework.AssertionFailedError: expected:3160 but was:3060 [junit] at org.apache.lucene.index.TestAddIndexes.testAddIndexesWithThreads(TestAddIndexes.java:783) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154) [junit] [junit] [junit] Tests run: 18, Failures: 1, Errors: 0, Time elapsed: 14.272 sec [junit] [junit] - Standard Error - [junit] NOTE: reproduce with: ant test -Dtestcase=TestAddIndexes -Dtestmethod=testAddIndexesWithThreads -Dtests.seed=6128854208955988865:2552774338676281184 [junit] NOTE: test params are: codec=PreFlex, locale=no_NO_NY, timezone=America/Edmonton [junit] NOTE: all tests run in this JVM: [junit] [TestToken, TestDateTools, Test2BTerms, TestAddIndexes] [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun 
Microsystems Inc. 1.6.0_24 (64-bit)/cpus=8,threads=1,free=84731792,total=258080768 [junit] - --- {noformat} and {noformat} [junit] Testsuite: org.apache.lucene.index.TestAddIndexes [junit] Testcase: testAddIndexesWithThreads(org.apache.lucene.index.TestAddIndexes): FAILED [junit] expected:3160 but was:3060 [junit] junit.framework.AssertionFailedError: expected:3160 but was:3060 [junit] at org.apache.lucene.index.TestAddIndexes.testAddIndexesWithThreads(TestAddIndexes.java:783) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154) [junit] [junit] [junit] Tests run: 18, Failures: 1, Errors: 0, Time elapsed: 14.841 sec [junit] [junit] - Standard Error - [junit] NOTE: reproduce with: ant test -Dtestcase=TestAddIndexes -Dtestmethod=testAddIndexesWithThreads -Dtests.seed=4502815121171887759:-6764285049309266272 [junit] NOTE: test params are: codec=PreFlex, locale=tr_TR, timezone=Mexico/BajaNorte [junit] NOTE: all tests run in this JVM: [junit] [TestToken, TestDateTools, Test2BTerms, TestAddIndexes] [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_24 (64-bit)/cpus=8,threads=1,free=163663416,total=243335168 [junit] - --- {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-3.x #93: POMs out of sync
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-3.x/93/ No tests ran. Build Log (for compile errors): [...truncated 7800 lines...]
Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x
As a result of Robert Muir's r1092398 commit on branch_3x (the Latvian analysis stuff), the build is now broken under Sun/Oracle JDK 1.5: [javadoc] C:\svn\lucene\dev\branches\branch_3x\lucene\contrib\analyzers\common\src\java\org\apache\lucene\analysis\lv\LatvianAnalyzer.java:118: warning - Tag @link: reference not found: org.apache.lucene.analysis.util.ReusableAnalyzerBase.TokenStreamComponents [...] [javadoc] 5 warnings BUILD FAILED C:\svn\lucene\dev\branches\branch_3x\lucene\build.xml:217: The following error occurred while executing this line: C:\svn\lucene\dev\branches\branch_3x\lucene\common-build.xml:813: Javadocs warnings were found! Robert said on #lucene IRC: (9:02:53 AM) rmuir: its a bug in the EOL'ed java5 (9:02:57 AM) rmuir: im not fixing it (9:03:21 AM) rmuir: build isnt broken (9:04:07 AM) rmuir: jre bug = not my problem (9:04:32 AM) rmuir: complain to oracle, having hit another one of their bugs at 3am last night, ive had it with their crap (9:04:42 AM) rmuir: i didnt break the build (9:04:43 AM) rmuir: its a bug in your jre Needless to say, I disagree vehemently with Robert. I think Robert should fix this or his commit should be reverted. I'm interested in hearing other opinions on this. Steve
Re: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x
What is the bug? I have an interest in this component and am willing to see about fixing it. It appears that it is a Javadoc bug??? Why would we keep good code out for that? -- DM On 04/15/2011 09:15 AM, Steven A Rowe wrote: As a result of Robert Muir's r1092398 commit on branch_3x (the Latvian analysis stuff), the build is now broken under Sun/Oracle JDK 1.5: [javadoc] C:\svn\lucene\dev\branches\branch_3x\lucene\contrib\analyzers\common\src\java\org\apache\lucene\analysis\lv\LatvianAnalyzer.java:118: warning - Tag @link: reference not found: org.apache.lucene.analysis.util.ReusableAnalyzerBase.TokenStreamComponents [...] [javadoc] 5 warnings BUILD FAILED C:\svn\lucene\dev\branches\branch_3x\lucene\build.xml:217: The following error occurred while executing this line: C:\svn\lucene\dev\branches\branch_3x\lucene\common-build.xml:813: Javadocs warnings were found! Robert said on #lucene IRC: (9:02:53 AM) rmuir: its a bug in the EOL'ed java5 (9:02:57 AM) rmuir: im not fixing it (9:03:21 AM) rmuir: build isnt broken (9:04:07 AM) rmuir: jre bug = not my problem (9:04:32 AM) rmuir: complain to oracle, having hit another one of their bugs at 3am last night, ive had it with their crap (9:04:42 AM) rmuir: i didnt break the build (9:04:43 AM) rmuir: its a bug in your jre Needless to say, I disagree vehemently with Robert. I think Robert should fix this or his commit should be reverted. I'm interested in hearing other opinions on this. Steve - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x
Turns out that the problem is that ReusableAnalyzerBase is not in the same package in branch_3x as on trunk, so the class reference is just wrong on branch_3x, regardless of the JDK one uses to generate javadocs. (Oracle 1.6.0_21 JDK triggers the same failure on branch_3x.) Robert, I trust you will see your way clear to fixing this, regardless of your lack of interest in supporting Oracle's buggy implementations. But I still think the issue should be resolved: Robert thinks he doesn't have to play by the rules (JDK 1.5 support for Lucene, javadocs warnings fail the build, failed builds are not allowed), and I think that's unacceptable. I think the proper course of action is calling a vote to change the rules, not just blithely ignoring them because you don't feel like following them. Unhappily, Steve -Original Message- From: DM Smith [mailto:dmsmith...@gmail.com] Sent: Friday, April 15, 2011 9:31 AM To: dev@lucene.apache.org Subject: Re: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x What is the bug? I have an interest in this component and am willing to see about fixing it. It appears that it is a Javadoc bug??? Why would we keep good code out for that? -- DM On 04/15/2011 09:15 AM, Steven A Rowe wrote: As a result of Robert Muir's r1092398 commit on branch_3x (the Latvian analysis stuff), the build is now broken under Sun/Oracle JDK 1.5: [javadoc] C:\svn\lucene\dev\branches\branch_3x\lucene\contrib\analyzers\common\src\ java\org\apache\lucene\analysis\lv\LatvianAnalyzer.java:118: warning - Tag @link: reference not found: org.apache.lucene.analysis.util.ReusableAnalyzerBase.TokenStreamComponent s [...] [javadoc] 5 warnings BUILD FAILED C:\svn\lucene\dev\branches\branch_3x\lucene\build.xml:217: The following error occurred while executing this line: C:\svn\lucene\dev\branches\branch_3x\lucene\common-build.xml:813: Javadocs warnings were found! 
Robert said on #lucene IRC: (9:02:53 AM) rmuir: its a bug in the EOL'ed java5 (9:02:57 AM) rmuir: im not fixing it (9:03:21 AM) rmuir: build isnt broken (9:04:07 AM) rmuir: jre bug = not my problem (9:04:32 AM) rmuir: complain to oracle, having hit another one of their bugs at 3am last night, ive had it with their crap (9:04:42 AM) rmuir: i didnt break the build (9:04:43 AM) rmuir: its a bug in your jre Needless to say, I disagree vehemently with Robert. I think Robert should fix this or his commit should be reverted. I'm interested in hearing other opinions on this. Steve
RE: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x
/me is unhappy, too. By the way, we use Java 5 also in Lucene trunk! In most cases such Javadoc bugs can be circumvented by using absolute class names. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Steven A Rowe [mailto:sar...@syr.edu] Sent: Friday, April 15, 2011 3:54 PM To: dev@lucene.apache.org Subject: RE: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x Turns out that the problem is that ReusableAnalyzerBase is not in the same package in branch_3x as on trunk, so the class reference is just wrong on branch_3x, regardless of the JDK one uses to generate javadocs. (Oracle 1.6.0_21 JDK triggers the same failure on branch_3x.) Robert, I trust you will see your way clear to fixing this, regardless of your lack of interest in supporting Oracle's buggy implementations. But I still think the issue should be resolved: Robert thinks he doesn't have to play by the rules (JDK 1.5 support for Lucene, javadocs warnings fail the build, failed builds are not allowed), and I think that's unacceptable. I think the proper course of action is calling a vote to change the rules, not just blithely ignoring them because you don't feel like following them. Unhappily, Steve -Original Message- From: DM Smith [mailto:dmsmith...@gmail.com] Sent: Friday, April 15, 2011 9:31 AM To: dev@lucene.apache.org Subject: Re: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x What is the bug? I have an interest in this component and am willing to see about fixing it. It appears that it is a Javadoc bug??? Why would we keep good code out for that? 
-- DM On 04/15/2011 09:15 AM, Steven A Rowe wrote: As a result of Robert Muir's r1092398 commit on branch_3x (the Latvian analysis stuff), the build is now broken under Sun/Oracle JDK 1.5: [javadoc] C:\svn\lucene\dev\branches\branch_3x\lucene\contrib\analyzers\common \s rc\ java\org\apache\lucene\analysis\lv\LatvianAnalyzer.java:118: warning - Tag @link: reference not found: org.apache.lucene.analysis.util.ReusableAnalyzerBase.TokenStreamCompon ent s [...] [javadoc] 5 warnings BUILD FAILED C:\svn\lucene\dev\branches\branch_3x\lucene\build.xml:217: The following error occurred while executing this line: C:\svn\lucene\dev\branches\branch_3x\lucene\common-build.xml:813: Javadocs warnings were found! Robert said on #lucene IRC: (9:02:53 AM) rmuir: its a bug in the EOL'ed java5 (9:02:57 AM) rmuir: im not fixing it (9:03:21 AM) rmuir: build isnt broken (9:04:07 AM) rmuir: jre bug = not my problem (9:04:32 AM) rmuir: complain to oracle, having hit another one of their bugs at 3am last night, ive had it with their crap (9:04:42 AM) rmuir: i didnt break the build (9:04:43 AM) rmuir: its a bug in your jre Needless to say, I disagree vehemently with Robert. I think Robert should fix this or his commit should be reverted. I'm interested in hearing other opinions on this. Steve - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
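Uwe's suggestion above — working around javadoc resolution quirks with absolute class names — combined with Steve's later finding that the reference was simply wrong on branch_3x, points at a fix like the following. This is an illustration only, written as a standalone doc fragment; the package names mirror the thread's example but are not verified against either branch here.

```java
/**
 * Broken on branch_3x: ReusableAnalyzerBase lives in a different package
 * there than on trunk, so this fully-qualified reference only resolves
 * on trunk:
 *
 *   {@link org.apache.lucene.analysis.util.ReusableAnalyzerBase.TokenStreamComponents}
 *
 * The fix is to write the absolute name as it exists on the branch being
 * built (per Uwe's advice, absolute names also sidestep javadoc bugs
 * around imported-name resolution):
 *
 *   {@link org.apache.lucene.analysis.ReusableAnalyzerBase.TokenStreamComponents}
 */
```

Since javadoc fails the build on warnings under common-build.xml, a dangling @link of this kind breaks the build on any JDK, not just 1.5.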
RE: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x
hi, my code is 100% valid java 5. if this oracle bug bothers you so much, perhaps you should switch to ibm's j9. this is an oracle specific bug, again there is nothing wrong with my code technically, thus no justification for revert. remember, working around bugs in particular jre implementations is optional, I've helped with this before but we need to make sure everyone understands this is not mandatory. I code to the spec On Apr 15, 2011 9:55 AM, Steven A Rowe sar...@syr.edu wrote: Turns out that the problem is that ReusableAnalyzerBase is not in the same package in branch_3x as on trunk, so the class reference is just wrong on branch_3x, regardless of the JDK one uses to generate javadocs. (Oracle 1.6.0_21 JDK triggers the same failure on branch_3x.) Robert, I trust you will see your way clear to fixing this, regardless of your lack of interest in supporting Oracle's buggy implementations. But I still think the issue should be resolved: Robert thinks he doesn't have to play by the rules (JDK 1.5 support for Lucene, javadocs warnings fail the build, failed builds are not allowed), and I think that's unacceptable. I think the proper course of action is calling a vote to change the rules, not just blithely ignoring them because you don't feel like following them. Unhappily, Steve -Original Message- From: DM Smith [mailto:dmsmith...@gmail.com] Sent: Friday, April 15, 2011 9:31 AM To: dev@lucene.apache.org Subject: Re: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x What is the bug? I have an interest in this component and am willing to see about fixing it. It appears that it is a Javadoc bug??? Why would we keep good code out for that? 
-- DM On 04/15/2011 09:15 AM, Steven A Rowe wrote: As a result of Robert Muir's r1092398 commit on branch_3x (the Latvian analysis stuff), the build is now broken under Sun/Oracle JDK 1.5: [javadoc] C:\svn\lucene\dev\branches\branch_3x\lucene\contrib\analyzers\common\src\ java\org\apache\lucene\analysis\lv\LatvianAnalyzer.java:118: warning - Tag @link: reference not found: org.apache.lucene.analysis.util.ReusableAnalyzerBase.TokenStreamComponent s [...] [javadoc] 5 warnings BUILD FAILED C:\svn\lucene\dev\branches\branch_3x\lucene\build.xml:217: The following error occurred while executing this line: C:\svn\lucene\dev\branches\branch_3x\lucene\common-build.xml:813: Javadocs warnings were found! Robert said on #lucene IRC: (9:02:53 AM) rmuir: its a bug in the EOL'ed java5 (9:02:57 AM) rmuir: im not fixing it (9:03:21 AM) rmuir: build isnt broken (9:04:07 AM) rmuir: jre bug = not my problem (9:04:32 AM) rmuir: complain to oracle, having hit another one of their bugs at 3am last night, ive had it with their crap (9:04:42 AM) rmuir: i didnt break the build (9:04:43 AM) rmuir: its a bug in your jre Needless to say, I disagree vehemently with Robert. I think Robert should fix this or his commit should be reverted. I'm interested in hearing other opinions on this. Steve
RE: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x
Robert, did you read my email at all? Your javadoc comment has a link to a non-existent class. That’s not “valid” under anybody’s jdk, regardless of version. Try running ‘ant javadoc’ under your favorite jre on branch_3x. No worky, dude. The fix is to remove “.util” from the package name. Doing so will not violate your principles. Steve From: Robert Muir [mailto:rcm...@gmail.com] Sent: Friday, April 15, 2011 10:52 AM To: dev@lucene.apache.org Subject: RE: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x hi, my code is 100% valid java 5. if this oracle bug bothers you so much, perhaps you should switch to ibm's j9. this is an oracle specific bug, again there is nothing wrong with my code technically, thus no justification for revert. remember, working around bugs in particular jre implementations is optional, I've helped with this before but we need to make sure everyone understands this is not mandatory. I code to the spec On Apr 15, 2011 9:55 AM, Steven A Rowe sar...@syr.edumailto:sar...@syr.edu wrote: Turns out that the problem is that ReusableAnalyzerBase is not in the same package in branch_3x as on trunk, so the class reference is just wrong on branch_3x, regardless of the JDK one uses to generate javadocs. (Oracle 1.6.0_21 JDK triggers the same failure on branch_3x.) Robert, I trust you will see your way clear to fixing this, regardless of your lack of interest in supporting Oracle's buggy implementations. But I still think the issue should be resolved: Robert thinks he doesn't have to play by the rules (JDK 1.5 support for Lucene, javadocs warnings fail the build, failed builds are not allowed), and I think that's unacceptable. I think the proper course of action is calling a vote to change the rules, not just blithely ignoring them because you don't feel like following them. 
Unhappily, Steve -Original Message- From: DM Smith [mailto:dmsmith...@gmail.commailto:dmsmith...@gmail.com] Sent: Friday, April 15, 2011 9:31 AM To: dev@lucene.apache.orgmailto:dev@lucene.apache.org Subject: Re: Robert Muir thinks we should stop supporting Sun/Oracle JDK 1.5 on branch_3x What is the bug? I have an interest in this component and am willing to see about fixing it. It appears that it is a Javadoc bug??? Why would we keep good code out for that? -- DM On 04/15/2011 09:15 AM, Steven A Rowe wrote: As a result of Robert Muir's r1092398 commit on branch_3x (the Latvian analysis stuff), the build is now broken under Sun/Oracle JDK 1.5: [javadoc] C:\svn\lucene\dev\branches\branch_3x\lucene\contrib\analyzers\common\src\ java\org\apache\lucene\analysis\lv\LatvianAnalyzer.java:118: warning - Tag @link: reference not found: org.apache.lucene.analysis.util.ReusableAnalyzerBase.TokenStreamComponent s [...] [javadoc] 5 warnings BUILD FAILED C:\svn\lucene\dev\branches\branch_3x\lucene\build.xml:217: The following error occurred while executing this line: C:\svn\lucene\dev\branches\branch_3x\lucene\common-build.xml:813: Javadocs warnings were found! Robert said on #lucene IRC: (9:02:53 AM) rmuir: its a bug in the EOL'ed java5 (9:02:57 AM) rmuir: im not fixing it (9:03:21 AM) rmuir: build isnt broken (9:04:07 AM) rmuir: jre bug = not my problem (9:04:32 AM) rmuir: complain to oracle, having hit another one of their bugs at 3am last night, ive had it with their crap (9:04:42 AM) rmuir: i didnt break the build (9:04:43 AM) rmuir: its a bug in your jre Needless to say, I disagree vehemently with Robert. I think Robert should fix this or his commit should be reverted. I'm interested in hearing other opinions on this. Steve
Re: finding exceptions that crash pylucene
Bill: I'm not sure I follow. why would raising the JVM memory to 4GB ever cause a crash in python? Our server has 48GB. thanks Marcus On Fri, Apr 15, 2011 at 7:33 AM, Bill Janssen jans...@parc.com wrote: Marcus qwe...@gmail.com wrote: --bcaec53043296dfbfd04a0ece1ac Content-Type: text/plain; charset=ISO-8859-1 we're currently using 4GB max heap. We recently moved from 2GB to 4GB when we discovered it prevented a crash with a certain set of docs. Marcus I've tried the same workaround with the heap in the past, and I found it caused NoMemory crashes in the Python side of the house, because the Python VM couldn't get enough memory to operate. So, be careful. On Thu, Apr 14, 2011 at 5:01 PM, Andi Vajda va...@apache.org wrote: On Thu, 14 Apr 2011, Marcus wrote: thanks. I have documents that will consistently cause this upon writing them to the index. let me see if I can reduce them down to the crux of the crash. granted, these are docs are very large, unruly bad data, that should have never gotten this stage in our pipeline, but I was hoping for a java or lucene exception. I also get Java GC overhead exceptions passed into my code from time to time, but those manageable, and not crashes. Are there known memory constraint scenarios that force a c++ exception, whereas in a normal Java environment, you would get a memory error? Not sure. and just confirming, do java.lang.OutOfMemoryError errors pass into python, or force a crash? Not sure, I've never seen these as I make sure I've got enough memory. initVM() is the place where you can configure the memory for your JVM. Andi.. thanks again Marcus On Thu, Apr 14, 2011 at 2:07 PM, Andi Vajda va...@apache.org wrote: On Thu, 14 Apr 2011, Marcus wrote: in certain cases when a java/pylucene exception occurs, it gets passed up in my code, and I'm able to analyze the situation. sometimes though, the python process just crashes, and if I happen to be in top (linux top that is), I see a JCC exception flash up in the top console. 
where can I go to look for this exception, or is it just lost? I looked in the locations where a java crash would be located, but didn't find anything. If you're hitting a crash because of an unhandled C++ exception, running a debug build with symbols under gdb will help greatly in tracking it down. An unhandled C++ exception would be a PyLucene/JCC bug. If you have a simple way to reproduce this failure, send it to this list. Andi.. --bcaec53043296dfbfd04a0ece1ac--
[jira] [Commented] (LUCENE-3018) Lucene Native Directory implementation need automated build
[ https://issues.apache.org/jira/browse/LUCENE-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020331#comment-13020331 ] Steven Rowe commented on LUCENE-3018: - bq. I built cpptasks-1.0b5 using jdk-1.5 . I have also attached the LICENSE file by renaming it according to license naming convention. Excellent. Lucene Native Directory implementation need automated build --- Key: LUCENE-3018 URL: https://issues.apache.org/jira/browse/LUCENE-3018 Project: Lucene - Java Issue Type: Wish Components: Build Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Varun Thacker Priority: Minor Fix For: 4.0 Attachments: LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, cpptasks-LICENSE-ASL.txt, cpptasks.jar, cpptasks.jar Currently the native directory impl in contrib/misc require manual action to compile the c code (partially) documented in https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/contrib/misc/src/java/overview.html yet it would be nice if we had an ant task and documentation for all platforms how to compile them and set up the prerequisites. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene Merge failing on Open Files
On Apr 14, 2011, at 3:24 PM, Simon Willnauer wrote: On Wed, Apr 6, 2011 at 8:44 PM, Grant Ingersoll gsing...@apache.org wrote: Begin forwarded message: From: Michael McCandless luc...@mikemccandless.com Date: April 5, 2011 5:46:13 AM EDT To: simon.willna...@gmail.com Cc: Simon Willnauer simon.willna...@googlemail.com, java-u...@lucene.apache.org, paul_t...@fastmail.fm Subject: Re: Lucene Merge failing on Open Files Reply-To: java-u...@lucene.apache.org Yeah, that mergeFactor is way too high and will cause too-many-open-files (if the index has enough segments). This is one of the things that has always bothered me about Merge Factor. We state what the lower bound is, but we don't doc the upper bound. Should we even allow higher values? Of course, how does one pick the cutoff? I've seen up to about 100 be effective. But 3000 is a bit high (although, who knows what the future will hold) grant, we can at least add some documentation no? Definitely! Docs good. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: finding exceptions that crash pylucene
I have had similar experience, but it was always a problem on the java side. What helped was to dump memory: -Xms512m -Xmx4500m -XX:+HeapDumpOnCtrlBreak -XX:+HeapDumpOnOutOfMemoryError Documentation says that upon catching the OOM, you should stop the JVM immediately. But actually it was possible to handle these problems. I started the processing inside a separate thread, cleaning properly -- if the thread raises OOM, it is possible to continue - I have done tests on thousands of docs and it always worked. But the main benefit of that solution is that I can see the errors inside Python and gracefully stop execution (without being shut out into the space). Marcus, I would recommend wrapping your processing inside a thread that starts another worker thread and make sure no references are kept. Roman On Fri, Apr 15, 2011 at 4:33 PM, Bill Janssen jans...@parc.com wrote: Marcus qwe...@gmail.com wrote: --bcaec53043296dfbfd04a0ece1ac Content-Type: text/plain; charset=ISO-8859-1 we're currently using 4GB max heap. We recently moved from 2GB to 4GB when we discovered it prevented a crash with a certain set of docs. Marcus I've tried the same workaround with the heap in the past, and I found it caused NoMemory crashes in the Python side of the house, because the Python VM couldn't get enough memory to operate. So, be careful. On Thu, Apr 14, 2011 at 5:01 PM, Andi Vajda va...@apache.org wrote: On Thu, 14 Apr 2011, Marcus wrote: thanks. I have documents that will consistently cause this upon writing them to the index. let me see if I can reduce them down to the crux of the crash. granted, these are docs are very large, unruly bad data, that should have never gotten this stage in our pipeline, but I was hoping for a java or lucene exception. I also get Java GC overhead exceptions passed into my code from time to time, but those manageable, and not crashes. 
Are there known memory constraint scenarios that force a c++ exception, whereas in a normal Java environment, you would get a memory error? Not sure. and just confirming, do java.lang.OutOfMemoryError errors pass into python, or force a crash? Not sure, I've never seen these as I make sure I've got enough memory. initVM() is the place where you can configure the memory for your JVM. Andi.. thanks again Marcus On Thu, Apr 14, 2011 at 2:07 PM, Andi Vajda va...@apache.org wrote: On Thu, 14 Apr 2011, Marcus wrote: in certain cases when a java/pylucene exception occurs, it gets passed up in my code, and I'm able to analyze the situation. sometimes though, the python process just crashes, and if I happen to be in top (linux top that is), I see a JCC exception flash up in the top console. where can I go to look for this exception, or is it just lost? I looked in the locations where a java crash would be located, but didn't find anything. If you're hitting a crash because of an unhandled C++ exception, running a debug build with symbols under gdb will help greatly in tracking it down. An unhandled C++ exception would be a PyLucene/JCC bug. If you have a simple way to reproduce this failure, send it to this list. Andi.. --bcaec53043296dfbfd04a0ece1ac--
[jira] [Commented] (LUCENE-3018) Lucene Native Directory implementation need automated build
[ https://issues.apache.org/jira/browse/LUCENE-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020339#comment-13020339 ] Varun Thacker commented on LUCENE-3018: --- If we place the cpptasks.jar in the lucene/contrib/misc/lib folder the command line for running the build.xml would be : {noformat} ant -lib lib/cpptasks.jar build-native {noformat} Lucene Native Directory implementation need automated build --- Key: LUCENE-3018 URL: https://issues.apache.org/jira/browse/LUCENE-3018 Project: Lucene - Java Issue Type: Wish Components: Build Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Varun Thacker Priority: Minor Fix For: 4.0 Attachments: LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, cpptasks-LICENSE-ASL.txt, cpptasks.jar, cpptasks.jar Currently the native directory impl in contrib/misc require manual action to compile the c code (partially) documented in https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/contrib/misc/src/java/overview.html yet it would be nice if we had an ant task and documentation for all platforms how to compile them and set up the prerequisites. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-Solr-tests-only-realtime_search-branch - Build # 11 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-realtime_search-branch/11/ 1 tests failed. REGRESSION: org.apache.solr.cloud.ZkSolrClientTest.testReconnect Error Message: KeeperErrorCode = ConnectionLoss for /collections Stack Trace: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /collections at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637) at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:347) at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:308) at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:290) at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:255) at org.apache.solr.cloud.ZkSolrClientTest.testReconnect(ZkSolrClientTest.java:81) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154) Build Log (for compile errors): [...truncated 8795 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1925) CSV Response Writer
[ https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020352#comment-13020352 ] Sirisha commented on SOLR-1925: --- How can we apply the CSVResponseWriter to a copyField? For example, when we have phone and altphone (a copyField), retrieving with the CSVResponseWriter returns 123-345-3456,126-737-5838; but when we want to load this file back into Solr, both values are stored in the phone field as one record. How can we parse them separately, so that phone contains 123-345-3456 and altphone (the copyField) contains 126-737-5838? CSV Response Writer --- Key: SOLR-1925 URL: https://issues.apache.org/jira/browse/SOLR-1925 Project: Solr Issue Type: New Feature Components: Response Writers Environment: indep. of env. Reporter: Chris A. Mattmann Assignee: Erik Hatcher Fix For: 3.1, 4.0 Attachments: SOLR-1925.Chheng.071410.patch.txt, SOLR-1925.Mattmann.053010.patch.2.txt, SOLR-1925.Mattmann.053010.patch.3.txt, SOLR-1925.Mattmann.053010.patch.txt, SOLR-1925.Mattmann.061110.patch.txt, SOLR-1925.patch, SOLR-1925.patch, SOLR-1925.patch As part of some work I'm doing, I put together a CSV Response Writer. It currently takes all the docs resultant from a query and then outputs their metadata in simple CSV format. The use of a delimiter is configurable (by default if there are multiple values for a particular field they are separated with a | symbol). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
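One client-side way to recover the separate values is to split the multi-valued cell after parsing the CSV. This is a hedged sketch, not part of the response writer itself: the function name is hypothetical, and it assumes the | multi-value delimiter mentioned in the issue description (adjust `delimiter` if a different one was configured):

```python
import csv
import io

def split_multivalued(csv_text, field, delimiter="|"):
    """Parse CSV text and split one multi-valued column into a list.

    `delimiter` is the multi-value separator the response writer used
    (the issue description says '|' is the default).
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row[field] = row[field].split(delimiter)
        rows.append(row)
    return rows

# Example: a phone column holding two joined values.
data = "id,phone\n1,123-345-3456|126-737-5838\n"
rows = split_multivalued(data, "phone")
# rows[0]["phone"] is now ["123-345-3456", "126-737-5838"]
```

Which list element maps back to phone versus altphone still has to come from your own schema knowledge; the CSV output itself does not label the copies.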
[jira] [Commented] (LUCENE-3018) Lucene Native Directory implementation need automated build
[ https://issues.apache.org/jira/browse/LUCENE-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020351#comment-13020351 ] Uwe Schindler commented on LUCENE-3018: --- In general, you should make it possible to not supply a -lib parameter at all. E.g. look like this is implemented for clover.jar in common-build.xml (which is also a external plugin to ant). The whole thing is to put the classpath into the build.xml before trying to use the plugin. Lucene Native Directory implementation need automated build --- Key: LUCENE-3018 URL: https://issues.apache.org/jira/browse/LUCENE-3018 Project: Lucene - Java Issue Type: Wish Components: Build Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Varun Thacker Priority: Minor Fix For: 4.0 Attachments: LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, cpptasks-LICENSE-ASL.txt, cpptasks.jar, cpptasks.jar Currently the native directory impl in contrib/misc require manual action to compile the c code (partially) documented in https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/contrib/misc/src/java/overview.html yet it would be nice if we had an ant task and documentation for all platforms how to compile them and set up the prerequisites. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3018) Lucene Native Directory implementation need automated build
[ https://issues.apache.org/jira/browse/LUCENE-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020354#comment-13020354 ] Uwe Schindler commented on LUCENE-3018: --- something like that:
{code:xml}
<taskdef resource="cpptasks.tasks">
  <classpath>
    <pathelement location="${lib}/cpptasks.jar"/>
  </classpath>
</taskdef>
{code}
Lucene Native Directory implementation need automated build --- Key: LUCENE-3018 URL: https://issues.apache.org/jira/browse/LUCENE-3018 Project: Lucene - Java Issue Type: Wish Components: Build Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Varun Thacker Priority: Minor Fix For: 4.0 Attachments: LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, cpptasks-LICENSE-ASL.txt, cpptasks.jar, cpptasks.jar Currently the native directory impl in contrib/misc require manual action to compile the c code (partially) documented in https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/contrib/misc/src/java/overview.html yet it would be nice if we had an ant task and documentation for all platforms how to compile them and set up the prerequisites. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (LUCENE-3018) Lucene Native Directory implementation need automated build
[ https://issues.apache.org/jira/browse/LUCENE-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020354#comment-13020354 ] Uwe Schindler edited comment on LUCENE-3018 at 4/15/11 5:03 PM: something like that:
{code:xml}
<taskdef resource="cpptasks.tasks">
  <classpath>
    <pathelement location="${lib}/cpptasks.jar"/>
  </classpath>
</taskdef>
{code}
And then use <cc/> as task without namespace declaration. This is the first example on the ant-contrib page. was (Author: thetaphi): something like that:
{code:xml}
<taskdef resource="cpptasks.tasks">
  <classpath>
    <pathelement location="${lib}/cpptasks.jar"/>
  </classpath>
</taskdef>
{code}
Lucene Native Directory implementation need automated build --- Key: LUCENE-3018 URL: https://issues.apache.org/jira/browse/LUCENE-3018 Project: Lucene - Java Issue Type: Wish Components: Build Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Varun Thacker Priority: Minor Fix For: 4.0 Attachments: LUCENE-3018.patch, LUCENE-3018.patch, LUCENE-3018.patch, cpptasks-LICENSE-ASL.txt, cpptasks.jar, cpptasks.jar Currently the native directory impl in contrib/misc require manual action to compile the c code (partially) documented in https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/contrib/misc/src/java/overview.html yet it would be nice if we had an ant task and documentation for all platforms how to compile them and set up the prerequisites. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020359#comment-13020359 ] Simon Willnauer commented on LUCENE-3023: - bq. Attached patch. awesome mike, I think you should commit that patch and we iterate once we are back from vacation? The RT hudson build tolerates //nocommit bq. I think we lost this infoStream output from trunk? do you recall where it was? bq. Can we rename Healthiness - DocumentsWriterStallControl (or something like that)? sure go ahead can you rename the test to testStalled instead of testHealthiness bq. It's looking good! I think we are reasonably close! Land DWPT on trunk -- Key: LUCENE-3023 URL: https://issues.apache.org/jira/browse/LUCENE-3023 Project: Lucene - Java Issue Type: Task Affects Versions: CSF branch, 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3023.patch, realtime-TestAddIndexes-3.txt, realtime-TestAddIndexes-5.txt, realtime-TestIndexWriterExceptions-assert-6.txt, realtime-TestIndexWriterExceptions-npe-1.txt, realtime-TestIndexWriterExceptions-npe-2.txt, realtime-TestIndexWriterExceptions-npe-4.txt, realtime-TestOmitTf-corrupt-0.txt With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so we can proceed landing the DWPT development on trunk soon. I think one of the bigger issues here is to make sure that all JavaDocs for IW etc. are still correct though. I will start going through that first. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
have a question on solr query
I have a field DestinationId that can take values such as '123 123' or '456'. I need only the rows whose values do not contain a space; for example, the row with '456' alone should be returned. Can you help? Thanks Premila
Re: An IDF variation with penalty for very rare terms
indeed, frequency usage is collection and use case dependent... Not directly your case, but the idea is the same. We used this information in spell/typo-variations context to boost/penalize similarity, by dividing terms into a couple of freq based segments. Take an example: Maria - Very High Freq Marina - Very High Freq Mraia - Very Low Freq similarity(Maria, Marina) is by string distance measures very high, practically the same as (Maria, Mraia) but the likelihood that you mistyped Mraia is an order of magnitude higher than if you hit a VHF-VHF pair. Point being, frequency hides a lot of semantics, and how you tune it, as Martin said, does not really matter, if it works. We also never found theory that formalizes this, but it was logical, and it worked in practice. What you said makes sense to me, especially for very big collections (or specialized domains with limited vocabulary...) the bigger the collection, the bigger the garbage density in the VLF domain (above a certain size of the collection). If vocabulary in your collection is somehow limited, there is a size limit where most new terms (VLF) are crapterms. One could try to estimate how saturated a collection is... cheers, eks On Wed, Apr 13, 2011 at 9:36 PM, Marvin Humphrey mar...@rectangular.com wrote: On Wed, Apr 13, 2011 at 01:01:09AM +0400, Earwin Burrfoot wrote: Excuse me for somewhat of an offtopic, but has anybody ever seen/used -subj- ? Something that looks like http://dl.dropbox.com/u/920413/IDFplusplus.png Traditional log(N/x) tail, but when nearing zero freq, instead of going to +inf you do a nice round bump (with controlled height/location/sharpness) and drop down to -inf (or zero). I haven't used that technique, nor can I quote academic literature blessing it. Nevertheless, what you're doing makes sense to me. 
Rationale is that - most good, discriminating terms are found in at least a certain percentage of your documents, but there are lots of mostly unique crapterms, which at some collection sizes stop being strictly unique and with IDF's help explode your scores. So you've designed a heuristic that allows you to filter a certain kind of noise. It sounds a lot like how people tune length normalization to adapt to their document collections. Many tuning techniques are corpus-specific. Whatever works, works! Marvin Humphrey - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
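The curve Earwin describes (a standard log IDF tail that is damped rather than exploding as document frequency approaches zero) can be sketched as a log term multiplied by a smooth penalty factor. This is a hedged illustration only: the function name and the `floor`/`sharpness` parameters are placeholders I chose, not values from the linked plot or from any Lucene Similarity:

```python
import math

def bumped_idf(doc_freq, num_docs, floor=5, sharpness=2.0):
    """Classic log(N/df) tail, but terms seen in fewer than about `floor`
    documents are damped toward zero instead of shooting toward +inf.

    `floor` controls where the bump sits; `sharpness` controls how fast
    the score drops for rarer-than-`floor` terms.
    """
    if doc_freq <= 0:
        return 0.0
    idf = math.log(num_docs / doc_freq)
    # Penalty factor: ~1 for df well above `floor`, ~0 for df << floor,
    # giving the "round bump then drop to zero" shape.
    penalty = 1.0 / (1.0 + (floor / doc_freq) ** sharpness)
    return idf * penalty

# A moderately rare term now outscores a near-unique "crapterm",
# which plain log(N/df) would rank highest of all.
moderately_rare = bumped_idf(50, 10**6)
near_unique = bumped_idf(1, 10**6)
```

Dropping to zero rather than negative infinity keeps the scores of documents that merely contain one noise term from being dragged below documents that match nothing.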
Re: have a question on solr query
Your problem statement is kinda sparse on details. Have you looked at the KeywordAnalyzer? If you don't see that as relevant, can you provide some more examples of the kinds of data you expect to put the field and queries that should and should not match? Best Erick On Tue, Apr 12, 2011 at 11:24 AM, Ramamurthy, Premila premila.ramamur...@travelocity.com wrote: I have a field DestinationId and it can take values ‘123 123’ or ‘456’ I need the results of rows which not have space in the values. I need the row which has ‘456’ alone to be returned. Can you help. Thanks Premila
RE: failure of some PyLucene tests on windows OS
On Fri, 15 Apr 2011, Thomas Koch wrote: I'd expect anyone running on Windows to see these test failures. Andi.. So what do you think about this issue - can we ignore this or claim it's a windows bug or hope that 'just' the test code is wrong? Here is what I think: I think that there is a bug. I don't know where but it happens only on Windows. I tried to fix it in the past but I don't run Windows much so I didn't try that hard. For a more definitive answer or explanation, someone with a stronger itch on Windows is going to have to step in. Until then, and it's been years, that bug shall remain a bit of a mystery. Given that no one has reported a more general bug in the same area of PyLucene indexing on Windows, it could mean that either there are no users (unlikely) or that this bug is just with the unit test (more likely). I'd suggest to at least apply the mentioned fix (i.e. uncomment the close call in test_PyLucene) to make this test run on windows. Of course someone should confirm this doesn't break the tests on Linux (or other OS)... just my thoughts. I uncommented line 260 in test_PyLucene.py as you suggested and tests seem to still pass on Mac OS X. I didn't try Linux or Solaris. In fact, I think that this commented line very much looks like an earlier attempt of mine at elucidating this bug that I didn't clean up properly. Checked into rev 1092774 of branch_3x. Andi..
Re: finding exceptions that crash pylucene
Marcus qwe...@gmail.com wrote: Bill: I'm not sure I follow. why would raising the JVM memory to 4GB ever cause a crash in python? Our server has 48GB. I don't know the specifics of your deployment, but you may not be able to use that much. 32-bit Python, for instance, won't be able to use it. Even with 64-bit Python, the OS may place limits on how much memory can be used by single process. If the Java VM uses too much, the Python VM will be choked. Bill thanks Marcus On Fri, Apr 15, 2011 at 7:33 AM, Bill Janssen jans...@parc.com wrote: Marcus qwe...@gmail.com wrote: --bcaec53043296dfbfd04a0ece1ac Content-Type: text/plain; charset=ISO-8859-1 we're currently using 4GB max heap. We recently moved from 2GB to 4GB when we discovered it prevented a crash with a certain set of docs. Marcus I've tried the same workaround with the heap in the past, and I found it caused NoMemory crashes in the Python side of the house, because the Python VM couldn't get enough memory to operate. So, be careful. On Thu, Apr 14, 2011 at 5:01 PM, Andi Vajda va...@apache.org wrote: On Thu, 14 Apr 2011, Marcus wrote: thanks. I have documents that will consistently cause this upon writing them to the index. let me see if I can reduce them down to the crux of the crash. granted, these are docs are very large, unruly bad data, that should have never gotten this stage in our pipeline, but I was hoping for a java or lucene exception. I also get Java GC overhead exceptions passed into my code from time to time, but those manageable, and not crashes. Are there known memory constraint scenarios that force a c++ exception, whereas in a normal Java environment, you would get a memory error? Not sure. and just confirming, do java.lang.OutOfMemoryError errors pass into python, or force a crash? Not sure, I've never seen these as I make sure I've got enough memory. initVM() is the place where you can configure the memory for your JVM. Andi.. 
thanks again Marcus On Thu, Apr 14, 2011 at 2:07 PM, Andi Vajda va...@apache.org wrote: On Thu, 14 Apr 2011, Marcus wrote: in certain cases when a java/pylucene exception occurs, it gets passed up in my code, and I'm able to analyze the situation. sometimes though, the python process just crashes, and if I happen to be in top (linux top that is), I see a JCC exception flash up in the top console. where can I go to look for this exception, or is it just lost? I looked in the locations where a java crash would be located, but didn't find anything. If you're hitting a crash because of an unhandled C++ exception, running a debug build with symbols under gdb will help greatly in tracking it down. An unhandled C++ exception would be a PyLucene/JCC bug. If you have a simple way to reproduce this failure, send it to this list. Andi..
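Andi's pointer to initVM() as the place to size the JVM can be made concrete. A hedged sketch: the initialheap/maxheap keyword names below are the ones JCC-generated initVM() wrappers commonly accept, so check them against your PyLucene version; the actual lucene.initVM call is left commented out so the sketch stands alone.

```python
# Hedged sketch of sizing the embedded JVM via PyLucene's initVM().
# The keyword names (initialheap/maxheap) are assumptions to verify
# against your PyLucene/JCC version.
def build_initvm_kwargs(max_heap_gb, initial_heap_mb=512):
    # Leave headroom: the JVM heap plus Python's own allocations must
    # both fit within what the OS grants a single process, or the
    # Python side starves (the NoMemory crashes Bill describes).
    return {
        "initialheap": "%dm" % initial_heap_mb,
        "maxheap": "%dg" % max_heap_gb,
    }

kwargs = build_initvm_kwargs(4)
# lucene.initVM(classpath=lucene.CLASSPATH, **kwargs)  # the real call
```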
Re: finding exceptions that crash pylucene
64-bit Linux (Ubuntu), so I don't think there's any practical memory limit on a python process in that case; I've had python processes up to 8GB personally. Marcus On Fri, Apr 15, 2011 at 10:54 AM, Bill Janssen jans...@parc.com wrote: Marcus qwe...@gmail.com wrote: Bill: I'm not sure I follow. why would raising the JVM memory to 4GB ever cause a crash in python? Our server has 48GB. I don't know the specifics of your deployment, but you may not be able to use that much. 32-bit Python, for instance, won't be able to use it. Even with 64-bit Python, the OS may place limits on how much memory can be used by a single process. If the Java VM uses too much, the Python VM will be choked. Bill thanks Marcus On Fri, Apr 15, 2011 at 7:33 AM, Bill Janssen jans...@parc.com wrote: Marcus qwe...@gmail.com wrote: we're currently using 4GB max heap. We recently moved from 2GB to 4GB when we discovered it prevented a crash with a certain set of docs. Marcus I've tried the same workaround with the heap in the past, and I found it caused NoMemory crashes in the Python side of the house, because the Python VM couldn't get enough memory to operate. So, be careful. On Thu, Apr 14, 2011 at 5:01 PM, Andi Vajda va...@apache.org wrote: On Thu, 14 Apr 2011, Marcus wrote: thanks. I have documents that will consistently cause this upon writing them to the index. let me see if I can reduce them down to the crux of the crash. granted, these docs are very large, unruly bad data, that should never have gotten to this stage in our pipeline, but I was hoping for a java or lucene exception. I also get Java GC overhead exceptions passed into my code from time to time, but those are manageable, and not crashes. Are there known memory constraint scenarios that force a c++ exception, whereas in a normal Java environment, you would get a memory error? Not sure.
and just confirming, do java.lang.OutOfMemoryError errors pass into python, or force a crash? Not sure, I've never seen these as I make sure I've got enough memory. initVM() is the place where you can configure the memory for your JVM. Andi.. thanks again Marcus On Thu, Apr 14, 2011 at 2:07 PM, Andi Vajda va...@apache.org wrote: On Thu, 14 Apr 2011, Marcus wrote: in certain cases when a java/pylucene exception occurs, it gets passed up in my code, and I'm able to analyze the situation. sometimes though, the python process just crashes, and if I happen to be in top (linux top that is), I see a JCC exception flash up in the top console. where can I go to look for this exception, or is it just lost? I looked in the locations where a java crash would be located, but didn't find anything. If you're hitting a crash because of an unhandled C++ exception, running a debug build with symbols under gdb will help greatly in tracking it down. An unhandled C++ exception would be a PyLucene/JCC bug. If you have a simple way to reproduce this failure, send it to this list. Andi..
[jira] [Created] (SOLR-2469) replication reserves commit-point forever if using replicateAfter=startup
replication reserves commit-point forever if using replicateAfter=startup - Key: SOLR-2469 URL: https://issues.apache.org/jira/browse/SOLR-2469 Project: Solr Issue Type: Bug Components: replication (java) Affects Versions: 3.1, 4.0 Reporter: Yonik Seeley Fix For: 3.2, 4.0 If we have replicateAfter startup, the files in the commit point are never deleted. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-2469) replication reserves commit-point forever if using replicateAfter=startup
[ https://issues.apache.org/jira/browse/SOLR-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley reassigned SOLR-2469: -- Assignee: Yonik Seeley replication reserves commit-point forever if using replicateAfter=startup - Key: SOLR-2469 URL: https://issues.apache.org/jira/browse/SOLR-2469 Project: Solr Issue Type: Bug Components: replication (java) Affects Versions: 3.1, 4.0 Reporter: Yonik Seeley Assignee: Yonik Seeley Fix For: 3.2, 4.0 If we have replicateAfter startup, the files in the commit point are never deleted. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1092812 [1/2] - in /lucene/dev/trunk/solr/src: java/org/apache/solr/handler/ java/org/apache/solr/response/transform/ java/org/apache/solr/schema/ java/org/apache/solr/search/ java/or
On Fri, Apr 15, 2011 at 5:13 PM, yo...@apache.org wrote: Author: yonik Date: Fri Apr 15 21:13:42 2011 New Revision: 1092812 URL: http://svn.apache.org/viewvc?rev=1092812view=rev Log: SOLR-2469: remove saveCommitPoint for replicate on startup Oops... this obviously got my work-in-progress from the pseudo-field issue too. That's ok though - it's all committable stuff - just doesn't have tests yet. -Yonik - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[Lucene.Net] [jira] [Updated] (LUCENENET-409) Invalid Base exception in DateField.StringToTime()
[ https://issues.apache.org/jira/browse/LUCENENET-409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy updated LUCENENET-409: --- Attachment: DateField.patch Hi Neal, Can you try the patch? DIGY Invalid Base exception in DateField.StringToTime() -- Key: LUCENENET-409 URL: https://issues.apache.org/jira/browse/LUCENENET-409 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Core Affects Versions: Lucene.Net 2.9.4 Reporter: Neal Granroth Attachments: DateField.patch The Lucene.Net.Documents.DateField.StringToTime() method called by StringToDate() appears to specify an invalid value for the base in the .NET Convert.ToInt64() call. When a DateField value in a legacy index is read, or Lucene.NET 2.9.4 is used with legacy code that relies upon DateField, the following exception occurs whenever StringToDate() is called: System.ArgumentException: Invalid Base. at System.Convert.ToInt64(String value, Int32 fromBase) at Lucene.Net.Documents.DateField.StringToTime(String s) at Lucene.Net.Documents.DateField.StringToDate(String s) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (SOLR-1709) Distributed Date Faceting
[ https://issues.apache.org/jira/browse/SOLR-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-1709: --- Fix Version/s: (was: 3.2) I just realized SOLR-1729 is only on trunk, and backporting it to 3x may be kind of invasive, so I'm removing 3.2 from the fix version for now... we can revisit it, but I'm concerned about including this patch as it stands w/o the universal NOW concept. obviously that patch can still be used on 3x w/o the distributed NOW support (particularly now that it works with facet.range) but we might want to make it more assertive about dealing with shards that return inconsistent ranges Distributed Date Faceting - Key: SOLR-1709 URL: https://issues.apache.org/jira/browse/SOLR-1709 Project: Solr Issue Type: Improvement Components: SearchComponents - other Affects Versions: 1.4 Reporter: Peter Sturge Assignee: Hoss Man Priority: Minor Fix For: 4.0 Attachments: FacetComponent.java, FacetComponent.java, ResponseBuilder.java, SOLR-1709.patch, SOLR-1709_distributed_date_faceting_v3x.patch, solr-1.4.0-solr-1709.patch This patch is for adding support for date facets when using distributed searches. Date faceting across multiple machines exposes some time-based issues that anyone interested in this behaviour should be aware of: Any time and/or time-zone differences are not accounted for in the patch (i.e. merged date facets are at a time-of-day, not necessarily at a universal 'instant-in-time', unless all shards are time-synced to the exact same time). The implementation uses the first encountered shard's facet_dates as the basis for subsequent shards' data to be merged in. This means that if subsequent shards' facet_dates are skewed in relation to the first by 1 'gap', these 'earlier' or 'later' facets will not be merged in. 
There are several reasons for this: * Performance: It's faster to check facet_date lists against a single map's data, rather than against each other, particularly if there are many shards * If 'earlier' and/or 'later' facet_dates are added in, this will make the time range larger than that which was requested (e.g. a request for one hour's worth of facets could bring back 2, 3 or more hours of data) This could be dealt with if timezone and skew information was added, and the dates were normalized. One possibility for adding such support is to [optionally] add 'timezone' and 'now' parameters to the 'facet_dates' map. This would tell requesters what time and TZ the remote server thinks it is, and so multiple shards' time data can be normalized. The patch affects 2 files in the Solr core: org.apache.solr.handler.component.FacetComponent.java org.apache.solr.handler.component.ResponseBuilder.java The main changes are in FacetComponent - ResponseBuilder is just to hold the completed SimpleOrderedMap until the finishStage. One possible enhancement is to perhaps make this an optional parameter, but really, if facet.date parameters are specified, it is assumed they are desired. Comments suggestions welcome. As a favour to ask, if anyone could take my 2 source files and create a PATCH file from it, it would be greatly appreciated, as I'm having a bit of trouble with svn (don't shoot me, but my environment is a Redmond-based os company). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
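The first-shard-as-basis merge rule described in this issue can be sketched as follows. This is an illustration of the rule, not the FacetComponent code, and the flat date-to-count maps are a simplification (real facet_dates responses also carry gap/start/end entries).

```python
# Sketch of the merge rule from SOLR-1709: the first shard's facet_dates
# keys form the basis, and later shards' counts are merged in only where
# the keys line up, so a shard skewed by one 'gap' silently drops buckets.
def merge_facet_dates(shard_maps):
    merged = dict(shard_maps[0])          # first encountered shard is the basis
    for shard in shard_maps[1:]:
        for date_key, count in shard.items():
            if date_key in merged:        # skewed 'earlier'/'later' keys are NOT merged
                merged[date_key] += count
    return merged

a = {"2011-04-15T00:00:00Z": 3, "2011-04-15T01:00:00Z": 1}
b = {"2011-04-15T01:00:00Z": 2, "2011-04-15T02:00:00Z": 5}  # skewed by one gap
m = merge_facet_dates([a, b])
# b's 02:00 bucket is dropped because the first shard never reported it
```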
[jira] [Resolved] (SOLR-2469) replication reserves commit-point forever if using replicateAfter=startup
[ https://issues.apache.org/jira/browse/SOLR-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved SOLR-2469. Resolution: Fixed Fix committed to trunk and 3x. As noted via email, the only code change related to this issue was to ReplicationHandler.java. The function query changes were part of another issue, unrelated to this bug, and accidentally committed. replication reserves commit-point forever if using replicateAfter=startup - Key: SOLR-2469 URL: https://issues.apache.org/jira/browse/SOLR-2469 Project: Solr Issue Type: Bug Components: replication (java) Affects Versions: 3.1, 4.0 Reporter: Yonik Seeley Assignee: Yonik Seeley Fix For: 3.2, 4.0 If we have replicateAfter startup, the files in the commit point are never deleted. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[Lucene.Net] [jira] [Commented] (LUCENENET-409) Invalid Base exception in DateField.StringToTime()
[ https://issues.apache.org/jira/browse/LUCENENET-409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020468#comment-13020468 ] Neal Granroth commented on LUCENENET-409: - I looked at the patch. It will clear the exception, but DateField will not work correctly. The StringToDate() method does not apply the reverse of the TicksToMilliseconds conversion which the DateToString() method applied. Also, the StringToDate() method does not apply the reverse of the UTC offset which the DateToString() method applied. Would you prefer separate JIRA issues for these? Invalid Base exception in DateField.StringToTime() -- Key: LUCENENET-409 URL: https://issues.apache.org/jira/browse/LUCENENET-409 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Core Affects Versions: Lucene.Net 2.9.4 Reporter: Neal Granroth Attachments: DateField.patch The Lucene.Net.Documents.DateField.StringToTime() method called by StringToDate() appears to specify an invalid value for the base in the .NET Convert.ToInt64() call. When a DateField value in a legacy index is read, or Lucene.NET 2.9.4 is used with legacy code that relies upon DateField, the following exception occurs whenever StringToDate() is called: System.ArgumentException: Invalid Base. at System.Convert.ToInt64(String value, Int32 fromBase) at Lucene.Net.Documents.DateField.StringToTime(String s) at Lucene.Net.Documents.DateField.StringToDate(String s) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
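For context on the "Invalid Base" part of this issue: Java's DateField encodes the millisecond timestamp in radix 36 (Character.MAX_RADIX), while .NET's Convert.ToInt64(String, Int32) accepts only bases 2, 8, 10, and 16, so a .NET port needs its own base-36 parser. A Python illustration of the round trip (not the Lucene.Net code):

```python
# Illustration of the radix mismatch behind the exception: the timestamp
# is encoded in base 36, which .NET's Convert.ToInt64 cannot parse.
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base36(n):
    """Encode a non-negative integer the way Long.toString(n, 36) would."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 36)
        out.append(DIGITS[r])
    return "".join(reversed(out))

millis = 1302825600000                 # an arbitrary timestamp in ms
encoded = to_base36(millis)
assert int(encoded, 36) == millis      # Python's int() handles radix 36 natively
```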
[jira] [Updated] (SOLR-2466) CommonsHttpSolrServer will retry a query even if _maxRetries is 0
[ https://issues.apache.org/jira/browse/SOLR-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-2466: --- Attachment: SOLR-2466.patch Here's a patch that sets retry to 0 in HttpClient and lets SolrJ do the retry based on its count. CommonsHttpSolrServer will retry a query even if _maxRetries is 0 - Key: SOLR-2466 URL: https://issues.apache.org/jira/browse/SOLR-2466 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4.1, 3.1, 4.0 Reporter: Tomás Fernández Löbbe Attachments: SOLR-2466.patch The HttpClient library used by CommonsHttpSolrServer will retry by default 3 times a request that failed on the server side, even if the _maxRetries field of CommonsHttpSolrServer is set to 0. The retry count should be managed in just one place and CommonsHttpSolrServer seems to be the right one. CommonsHttpSolrServer should override that HttpClient default to 0 retries, and manage the retry count with the value of the field _maxRetries. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
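The "manage retries in one place" idea behind the patch can be sketched as follows. Names are illustrative, not the SolrJ code: the transport layer is configured for zero automatic retries, and the client wrapper alone applies its own _maxRetries-style count.

```python
# Sketch of SOLR-2466's fix: the lower layer (HttpClient in the real code)
# never retries on its own; only the client wrapper's counter does.
class Transport:
    """Fake transport that fails a fixed number of times, then succeeds."""
    def __init__(self, fail_times):
        self.calls = 0
        self.fail_times = fail_times

    def request(self):
        self.calls += 1
        if self.calls <= self.fail_times:
            raise IOError("server error")
        return "ok"

def solr_request(transport, max_retries=0):
    attempts = 0
    while True:
        try:
            return transport.request()
        except IOError:
            if attempts >= max_retries:
                raise          # max_retries=0 means the first failure propagates
            attempts += 1

t = Transport(fail_times=1)
try:
    solr_request(t, max_retries=0)
except IOError:
    pass
assert t.calls == 1            # exactly one attempt: no hidden lower-layer retry
```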
[jira] [Created] (SOLR-2470) velocity response writer needs test
velocity response writer needs test --- Key: SOLR-2470 URL: https://issues.apache.org/jira/browse/SOLR-2470 Project: Solr Issue Type: Test Reporter: Yonik Seeley /browse was broken w/o anyone realizing... we should have a basic test for it -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2466) CommonsHttpSolrServer will retry a query even if _maxRetries is 0
[ https://issues.apache.org/jira/browse/SOLR-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved SOLR-2466. Resolution: Fixed Fix Version/s: 3.2 committed to 3x and trunk. CommonsHttpSolrServer will retry a query even if _maxRetries is 0 - Key: SOLR-2466 URL: https://issues.apache.org/jira/browse/SOLR-2466 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4.1, 3.1, 4.0 Reporter: Tomás Fernández Löbbe Fix For: 3.2 Attachments: SOLR-2466.patch The HttpClient library used by CommonsHttpSolrServer will retry by default 3 times a request that failed on the server side, even if the _maxRetries field of CommonsHttpSolrServer is set to 0. The retry count should be managed in just one place and CommonsHttpSolrServer seems to be the right one. CommonsHttpSolrServer should override that HttpClient default to 0 retries, and manage the retry count with the value of the field _maxRetries. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2383) Velocity: Generalize range and date facet display
[ https://issues.apache.org/jira/browse/SOLR-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020510#comment-13020510 ] Yonik Seeley commented on SOLR-2383: Looking good Jan! I did notice one issue with the example data: the popularity 3-6 link shows 1 match, but when I click it, I get 6 results. I believe this is because the faceting is finding (3 <= popularity < 6) while the filter generated by clicking on the link is popularity:[3 TO 6] So I guess the generated filters should generally be of the form popularity:[3 TO 6} ? Velocity: Generalize range and date facet display - Key: SOLR-2383 URL: https://issues.apache.org/jira/browse/SOLR-2383 Project: Solr Issue Type: Bug Components: Response Writers Reporter: Jan Høydahl Labels: facet, range, velocity Attachments: SOLR-2383.patch, SOLR-2383.patch, SOLR-2383.patch, SOLR-2383.patch Velocity (/browse) GUI has a hardcoded price range facet and a hardcoded manufacturedate_dt date facet. Need a general solution which works for any facet.range and facet.date. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
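The bound mismatch Yonik describes can be shown with a tiny example (illustrative Python, not Solr code, over hypothetical popularity values): range faceting counts 3 <= popularity < 6, the filter popularity:[3 TO 6] matches 3 <= popularity <= 6 inclusively, and the mixed-bracket form popularity:[3 TO 6} excludes the upper bound so it agrees with the facet count.

```python
# Hypothetical popularity values; the discrepancy comes from documents
# sitting exactly on the upper bound.
docs = [2, 3, 4, 5, 6, 6, 6, 6, 6]

facet_count = sum(1 for p in docs if 3 <= p < 6)    # what range faceting counted
inclusive   = sum(1 for p in docs if 3 <= p <= 6)   # filter popularity:[3 TO 6]
exclusive   = sum(1 for p in docs if 3 <= p < 6)    # filter popularity:[3 TO 6}

assert facet_count == exclusive    # [3 TO 6} matches the facet link's count
assert inclusive > facet_count     # [3 TO 6] picks up the boundary docs too
```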
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 7158 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/7158/ All tests passed Build Log (for compile errors): [...truncated 4371 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 7159 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/7159/ All tests passed Build Log (for compile errors): [...truncated 4361 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2409) edismax unescaped colon returns no results
[ https://issues.apache.org/jira/browse/SOLR-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-2409: --- Attachment: SOLR-2409.patch Here's a draft patch that takes the more direct approach of actually checking if the field name of a fielded query exists. edismax unescaped colon returns no results -- Key: SOLR-2409 URL: https://issues.apache.org/jira/browse/SOLR-2409 Project: Solr Issue Type: Bug Components: search Reporter: Ryan McKinley Assignee: Yonik Seeley Priority: Minor Attachments: SOLR-2409-unescapedcolon.patch, SOLR-2409.patch The edismax query parser should behave OK when a colon is in the query, but does not refer to a field name. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-2409) edismax unescaped colon returns no results
[ https://issues.apache.org/jira/browse/SOLR-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley reassigned SOLR-2409: -- Assignee: Yonik Seeley edismax unescaped colon returns no results -- Key: SOLR-2409 URL: https://issues.apache.org/jira/browse/SOLR-2409 Project: Solr Issue Type: Bug Components: search Reporter: Ryan McKinley Assignee: Yonik Seeley Priority: Minor Attachments: SOLR-2409-unescapedcolon.patch, SOLR-2409.patch The edismax query parser should behave OK when a colon is in the query, but does not refer to a field name. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2409) edismax unescaped colon returns no results
[ https://issues.apache.org/jira/browse/SOLR-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020545#comment-13020545 ] Yonik Seeley commented on SOLR-2409: One problem with the previous approach of checking for a zero length query is that it wouldn't handle cases like I loved Terminator 2: Judgement Day Because that gets truncated to (Terminator Day) which isn't zero length. edismax unescaped colon returns no results -- Key: SOLR-2409 URL: https://issues.apache.org/jira/browse/SOLR-2409 Project: Solr Issue Type: Bug Components: search Reporter: Ryan McKinley Assignee: Yonik Seeley Priority: Minor Attachments: SOLR-2409-unescapedcolon.patch, SOLR-2409.patch The edismax query parser should behave OK when a colon is in the query, but does not refer to a field name. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
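The two approaches discussed in this issue can be contrasted with a small sketch: the old zero-length check misses queries like the one above, while checking the token before the colon against the schema's field names catches it. The helper and field set below are hypothetical, not the edismax code.

```python
# Hypothetical sketch of "check if the field name exists": a colon whose
# prefix is not a real schema field is escaped and treated as literal text.
SCHEMA_FIELDS = {"title", "body", "popularity"}

def escape_unknown_fields(query):
    out = []
    for token in query.split():
        field, colon, _rest = token.partition(":")
        if colon and field not in SCHEMA_FIELDS:
            token = token.replace(":", "\\:")   # keep the colon as plain text
        out.append(token)
    return " ".join(out)

q = escape_unknown_fields("I loved Terminator 2: Judgement Day")
# "2" is not a schema field, so its colon is escaped instead of the
# clause being dropped, and no term is silently truncated away.
```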
[jira] [Updated] (SOLR-2409) edismax unescaped colon returns no results
[ https://issues.apache.org/jira/browse/SOLR-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-2409: --- Attachment: SOLR-2409.patch Here's an updated patch with tests (both from Ryan's patch and my own additions). edismax unescaped colon returns no results -- Key: SOLR-2409 URL: https://issues.apache.org/jira/browse/SOLR-2409 Project: Solr Issue Type: Bug Components: search Reporter: Ryan McKinley Assignee: Yonik Seeley Priority: Minor Attachments: SOLR-2409-unescapedcolon.patch, SOLR-2409.patch, SOLR-2409.patch The edismax query parser should behave OK when a colon is in the query, but does not refer to a field name. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-trunk - Build # 1531 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1531/ All tests passed Build Log (for compile errors): [...truncated 13002 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 7160 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/7160/ All tests passed Build Log (for compile errors): [...truncated 4363 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Setting the max number of merge threads across IndexWriters
Hi This was raised in LUCENE-2755 (along with other useful refactoring to MS-IW-MP interaction). Here is the relevant comment which addresses Jason's particular issue: https://issues.apache.org/jira/browse/LUCENE-2755?focusedCommentId=12966029page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12966029 In short, we can refactor CMS to not hold to an IndexWriter member if we change a lot of the API. But IMO, an ExecutorServiceMS is the right way to go, if you don't mind giving up some CMS features, like controlling thread priority and stalling running threads. In fact, even w/ ExecutorServiceMS you can still achieve some (e.g., stalling), but some juggling will be required. Then, instead of trying to factor out IW members from this MS, you could share the same ES with all MS instances, each will keep a reference to a different IW member. This is just a thought though, I haven't tried it. Shai On Thu, Apr 14, 2011 at 8:23 PM, Earwin Burrfoot ear...@gmail.com wrote: Can't remember. Probably no. I started an experimental MS api rewrite (incorporating ability to share MSs between IWs) some time ago, but never had the time to finish it. On Thu, Apr 14, 2011 at 19:56, Simon Willnauer simon.willna...@googlemail.com wrote: On Thu, Apr 14, 2011 at 5:52 PM, Earwin Burrfoot ear...@gmail.com wrote: I proposed to decouple MergeScheduler from IW (stop keeping a reference to it). Then you can create a single CMS and pass it to all your IWs. Yep that was it... is there an issue for this? simon On Thu, Apr 14, 2011 at 19:40, Jason Rutherglen jason.rutherg...@gmail.com wrote: I think the proposal involved using a ThreadPoolExecutor, which seemed to not quite work as well as what we have. I think it'll be easier to simply pass a global context that keeps a counter of the actively running threads, and pass that into each IW's CMS? 
On Thu, Apr 14, 2011 at 8:25 AM, Simon Willnauer simon.willna...@googlemail.com wrote: On Thu, Apr 14, 2011 at 5:20 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Today the ConcurrentMergeScheduler allows setting the max thread count and is bound to a single IndexWriter. However in the [common] case of multiple IndexWriters running in the same process, this disallows one from managing the aggregate number of merge threads executing at any given time. I think this can be fixed, shall I open an issue? go ahead! I think I have seen this suggestion somewhere maybe you need to see if there is one already simon - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Kirill Zakharenko/Кирилл Захаренко E-Mail/Jabber: ear...@gmail.com Phone: +7 (495) 683-567-4 ICQ: 104465785 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Kirill Zakharenko/Кирилл Захаренко E-Mail/Jabber: ear...@gmail.com Phone: +7 (495) 683-567-4 ICQ: 104465785 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
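Jason's "global context that keeps a counter of the actively running threads" can be sketched with a process-wide semaphore shared by every writer's merge scheduler, instead of a per-IndexWriter thread cap. This is an illustration of the idea, not Lucene code, and the names are invented.

```python
# Sketch of an aggregate merge-thread cap shared across IndexWriters:
# every scheduler funnels merges through one bounded semaphore.
import threading

GLOBAL_MERGE_SLOTS = threading.BoundedSemaphore(3)   # aggregate cap, process-wide

def run_merge(merge_fn):
    # Blocks while 3 merges (across ALL writers) are already running.
    with GLOBAL_MERGE_SLOTS:
        return merge_fn()

results = []
threads = [
    threading.Thread(target=run_merge, args=(lambda i=i: results.append(i),))
    for i in range(10)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert len(results) == 10   # every merge ran, gated by the shared cap
```

An ExecutorService shared by all merge schedulers (Shai's suggestion) achieves the same aggregate limit, at the cost of CMS features like thread-priority control and stalling.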
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 7161 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/7161/ All tests passed Build Log (for compile errors): [...truncated 4362 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Setting the max number of merge threads across IndexWriters
I'd rather not lose [important] functionality. I think a global max thread count is the least intrusive way to go, however I also need to see if that's possible. If so I'll open an issue and post a patch. 2011/4/15 Shai Erera ser...@gmail.com: Hi This was raised in LUCENE-2755 (along with other useful refactoring to MS-IW-MP interaction). Here is the relevant comment which addresses Jason's particular issue: https://issues.apache.org/jira/browse/LUCENE-2755?focusedCommentId=12966029page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12966029 In short, we can refactor CMS to not hold to an IndexWriter member if we change a lot of the API. But IMO, an ExecutorServiceMS is the right way to go, if you don't mind giving up some CMS features, like controlling thread priority and stalling running threads. In fact, even w/ ExecutorServiceMS you can still achieve some (e.g., stalling), but some juggling will be required. Then, instead of trying to factor out IW members from this MS, you could share the same ES with all MS instances, each will keep a reference to a different IW member. This is just a thought though, I haven't tried it. Shai On Thu, Apr 14, 2011 at 8:23 PM, Earwin Burrfoot ear...@gmail.com wrote: Can't remember. Probably no. I started an experimental MS api rewrite (incorporating ability to share MSs between IWs) some time ago, but never had the time to finish it. On Thu, Apr 14, 2011 at 19:56, Simon Willnauer simon.willna...@googlemail.com wrote: On Thu, Apr 14, 2011 at 5:52 PM, Earwin Burrfoot ear...@gmail.com wrote: I proposed to decouple MergeScheduler from IW (stop keeping a reference to it). Then you can create a single CMS and pass it to all your IWs. Yep that was it... is there an issue for this? simon On Thu, Apr 14, 2011 at 19:40, Jason Rutherglen jason.rutherg...@gmail.com wrote: I think the proposal involved using a ThreadPoolExecutor, which seemed to not quite work as well as what we have. 
I think it'll be easier to simply pass a global context that keeps a counter of the actively running threads, and pass that into each IW's CMS? On Thu, Apr 14, 2011 at 8:25 AM, Simon Willnauer simon.willna...@googlemail.com wrote: On Thu, Apr 14, 2011 at 5:20 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Today the ConcurrentMergeScheduler allows setting the max thread count and is bound to a single IndexWriter. However in the [common] case of multiple IndexWriters running in the same process, this disallows one from managing the aggregate number of merge threads executing at any given time. I think this can be fixed, shall I open an issue? go ahead! I think I have seen this suggestion somewhere maybe you need to see if there is one already simon - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Kirill Zakharenko/Кирилл Захаренко E-Mail/Jabber: ear...@gmail.com Phone: +7 (495) 683-567-4 ICQ: 104465785 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Kirill Zakharenko/Кирилл Захаренко E-Mail/Jabber: ear...@gmail.com Phone: +7 (495) 683-567-4 ICQ: 104465785 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org