[ https://issues.apache.org/jira/browse/LUCENE-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16689623#comment-16689623 ]

Steve Rowe commented on LUCENE-8517:
------------------------------------

Another reproducing seed, though it only fails for me when I run the whole suite, 
i.e. with {{-Dtests.method=testRandomChainsWithLargeStrings}} removed from the 
command line. Maybe this test method is somehow affected by other methods? From 
[https://builds.apache.org/job/Lucene-Solr-NightlyTests-7.x/377]:

{noformat}
   [junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains
   [junit4]   2> Exception from random analyzer: 
   [junit4]   2> charfilters=
   [junit4]   2> tokenizer=
   [junit4]   2>   org.apache.lucene.analysis.MockTokenizer(org.apache.lucene.util.AttributeFactory$1@9c912349, initial state: 0
   [junit4]   2> state 0 [reject]:
   [junit4]   2>  a -> 1
   [junit4]   2>  b -> 2
   [junit4]   2>  f -> 3
   [junit4]   2>  i -> 4
   [junit4]   2>  n -> 5
   [junit4]   2>  o -> 6
   [junit4]   2>  s -> 7
   [junit4]   2>  t -> 8
   [junit4]   2>  w -> 9
   [junit4]   2> state 1 [accept]:
   [junit4]   2>  n -> 10
   [junit4]   2>  r -> 11
   [junit4]   2>  s -> 12
   [junit4]   2>  t -> 13
   [junit4]   2> state 2 [reject]:
   [junit4]   2>  e -> 14
   [junit4]   2>  u -> 15
   [junit4]   2>  y -> 16
   [junit4]   2> state 3 [reject]:
   [junit4]   2>  o -> 17
   [junit4]   2> state 4 [reject]:
   [junit4]   2>  f -> 18
   [junit4]   2>  n -> 19
   [junit4]   2>  s -> 20
   [junit4]   2>  t -> 21
   [junit4]   2> state 5 [reject]:
   [junit4]   2>  o -> 22
   [junit4]   2> state 6 [reject]:
   [junit4]   2>  f -> 23
   [junit4]   2>  n -> 24
   [junit4]   2>  r -> 25
   [junit4]   2> state 7 [reject]:
   [junit4]   2>  u -> 26
   [junit4]   2> state 8 [reject]:
   [junit4]   2>  h -> 27
   [junit4]   2>  o -> 28
   [junit4]   2> state 9 [reject]:
   [junit4]   2>  a -> 29
   [junit4]   2>  i -> 30
   [junit4]   2> state 10 [accept]:
   [junit4]   2>  d -> 31
   [junit4]   2> state 11 [reject]:
   [junit4]   2>  e -> 32
   [junit4]   2> state 12 [accept]:
   [junit4]   2> state 13 [accept]:
   [junit4]   2> state 14 [accept]:
   [junit4]   2> state 15 [reject]:
   [junit4]   2>  t -> 33
   [junit4]   2> state 16 [accept]:
   [junit4]   2> state 17 [reject]:
   [junit4]   2>  r -> 34
   [junit4]   2> state 18 [accept]:
   [junit4]   2> state 19 [accept]:
   [junit4]   2>  t -> 35
   [junit4]   2> state 20 [accept]:
   [junit4]   2> state 21 [accept]:
   [junit4]   2> state 22 [accept]:
   [junit4]   2>  t -> 36
   [junit4]   2> state 23 [accept]:
   [junit4]   2> state 24 [accept]:
   [junit4]   2> state 25 [accept]:
   [junit4]   2> state 26 [reject]:
   [junit4]   2>  c -> 37
   [junit4]   2> state 27 [reject]:
   [junit4]   2>  a -> 38
   [junit4]   2>  e -> 39
   [junit4]   2>  i -> 40
   [junit4]   2> state 28 [accept]:
   [junit4]   2> state 29 [reject]:
   [junit4]   2>  s -> 41
   [junit4]   2> state 30 [reject]:
   [junit4]   2>  l -> 42
   [junit4]   2>  t -> 43
   [junit4]   2> state 31 [accept]:
   [junit4]   2> state 32 [accept]:
   [junit4]   2> state 33 [accept]:
   [junit4]   2> state 34 [accept]:
   [junit4]   2> state 35 [reject]:
   [junit4]   2>  o -> 44
   [junit4]   2> state 36 [accept]:
   [junit4]   2> state 37 [reject]:
   [junit4]   2>  h -> 45
   [junit4]   2> state 38 [reject]:
   [junit4]   2>  t -> 46
   [junit4]   2> state 39 [accept]:
   [junit4]   2>  i -> 47
   [junit4]   2>  n -> 48
   [junit4]   2>  r -> 49
   [junit4]   2>  s -> 50
   [junit4]   2>  y -> 51
   [junit4]   2> state 40 [reject]:
   [junit4]   2>  s -> 52
   [junit4]   2> state 41 [accept]:
   [junit4]   2> state 42 [reject]:
   [junit4]   2>  l -> 53
   [junit4]   2> state 43 [reject]:
   [junit4]   2>  h -> 54
   [junit4]   2> state 44 [accept]:
   [junit4]   2> state 45 [accept]:
   [junit4]   2> state 46 [accept]:
   [junit4]   2> state 47 [reject]:
   [junit4]   2>  r -> 55
   [junit4]   2> state 48 [accept]:
   [junit4]   2> state 49 [reject]:
   [junit4]   2>  e -> 56
   [junit4]   2> state 50 [reject]:
   [junit4]   2>  e -> 57
   [junit4]   2> state 51 [accept]:
   [junit4]   2> state 52 [accept]:
   [junit4]   2> state 53 [accept]:
   [junit4]   2> state 54 [accept]:
   [junit4]   2> state 55 [accept]:
   [junit4]   2> state 56 [accept]:
   [junit4]   2> state 57 [accept]:
   [junit4]   2> , true)
   [junit4]   2> filters=
   [junit4]   2>   org.apache.lucene.analysis.shingle.ShingleFilter(ValidatingTokenFilter@13de14e term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1)
   [junit4]   2>   Conditional:org.apache.lucene.analysis.shingle.FixedShingleFilter(OneTimeWrapper@2c0047b9 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1, 2)
   [junit4]   2>   org.apache.lucene.analysis.miscellaneous.DateRecognizerFilter(ValidatingTokenFilter@7846ee89 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1)
   [junit4]   2> NOTE: download the large Jenkins line-docs file by running 'ant get-jenkins-line-docs' in the lucene directory.
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestRandomChains -Dtests.method=testRandomChainsWithLargeStrings -Dtests.seed=8021DE70475B4140 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-7.x/test-data/enwiki.random.lines.txt -Dtests.locale=en-PH -Dtests.timezone=America/Belize -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   14.4s J1 | TestRandomChains.testRandomChainsWithLargeStrings <<<
   [junit4]    > Throwable #1: java.lang.IllegalStateException: stage 2: inconsistent endOffset at pos=1: 2 vs 3; token=be s s
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([8021DE70475B4140:EA7A61611E1561B3]:0)
   [junit4]    >        at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:125)
   [junit4]    >        at org.apache.lucene.analysis.FilteringTokenFilter.incrementToken(FilteringTokenFilter.java:49)
   [junit4]    >        at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:68)
   [junit4]    >        at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkResetException(BaseTokenStreamTestCase.java:441)
   [junit4]    >        at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:546)
   [junit4]    >        at org.apache.lucene.analysis.core.TestRandomChains.testRandomChainsWithLargeStrings(TestRandomChains.java:897)
   [junit4]    >        at java.lang.Thread.run(Thread.java:748)
   [junit4]   2> NOTE: leaving temporary files on disk at: /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-7.x/checkout/lucene/build/analysis/common/test/J1/temp/lucene.analysis.core.TestRandomChains_8021DE70475B4140-001
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): {dummy=PostingsFormat(name=LuceneVarGapDocFreqInterval)}, docValues:{}, maxPointsInLeafNode=1764, maxMBSortInHeap=6.560614608283627, sim=RandomSimilarity(queryNorm=true): {dummy=DFR GLZ(0.3)}, locale=en-PH, timezone=America/Belize
   [junit4]   2> NOTE: Linux 4.4.0-137-generic amd64/Oracle Corporation 1.8.0_191 (64-bit)/cpus=4,threads=1,free=210477784,total=288358400
{noformat}
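
For anyone reading the exception text cold: as far as I understand it, ValidatingTokenFilter remembers the start and end offsets it has seen at each token position and throws when a later token disagrees with them. Below is a simplified, self-contained sketch of that kind of check. It is not the actual Lucene source; the class name {{OffsetConsistencyChecker}} and its {{check}} method are hypothetical, and it works on plain values rather than Lucene attributes.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical, simplified illustration of a per-position offset consistency
 * check. This is an assumption-based sketch, not the Lucene implementation.
 */
class OffsetConsistencyChecker {
  private final String name; // label used as the message prefix, e.g. "stage 2"
  private final Map<Integer, Integer> posToStartOffset = new HashMap<>();
  private final Map<Integer, Integer> posToEndOffset = new HashMap<>();
  private int pos = -1; // position of the most recent token

  OffsetConsistencyChecker(String name) {
    this.name = name;
  }

  /** Records one token and throws if its offsets contradict earlier tokens. */
  void check(String term, int posInc, int posLen, int startOffset, int endOffset) {
    pos += posInc;
    int startPos = pos;        // position where this token starts
    int endPos = pos + posLen; // position where this token ends

    // All tokens starting at the same position must share a startOffset.
    Integer oldStart = posToStartOffset.putIfAbsent(startPos, startOffset);
    if (oldStart != null && oldStart != startOffset) {
      throw new IllegalStateException(name + ": inconsistent startOffset at pos="
          + startPos + ": " + oldStart + " vs " + startOffset + "; token=" + term);
    }

    // All tokens ending at the same position must share an endOffset.
    Integer oldEnd = posToEndOffset.putIfAbsent(endPos, endOffset);
    if (oldEnd != null && oldEnd != endOffset) {
      throw new IllegalStateException(name + ": inconsistent endOffset at pos="
          + endPos + ": " + oldEnd + " vs " + endOffset + "; token=" + term);
    }
  }
}
```

With made-up input, calling {{check("be", 1, 1, 0, 2)}} and then {{check("be s s", 0, 1, 0, 3)}} on a checker named "stage 2" makes both tokens end at position 1 with different end offsets, so the second call throws. These values are chosen only to mirror the shape of the message in the log; I have not confirmed that they match the token sequence in the real failure.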

> TestRandomChains.testRandomChainsWithLargeStrings failure
> ---------------------------------------------------------
>
>                 Key: LUCENE-8517
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8517
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/analysis
>            Reporter: Steve Rowe
>            Priority: Major
>
> From 
> [https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Linux/2828/consoleText], 
> reproduces for me on Java8:
> {noformat}
> Checking out Revision 216f10026b86627750e133fe24ce6a750c470695 (refs/remotes/origin/branch_7x)
> [...]
> [java-info] java version "10.0.1"
> [java-info] OpenJDK Runtime Environment (10.0.1+10, Oracle Corporation)
> [java-info] OpenJDK 64-Bit Server VM (10.0.1+10, Oracle Corporation)
> [java-info] Test args: [-XX:-UseCompressedOops -XX:+UseConcMarkSweepGC]
> [...]
>    [junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains
>    [junit4]   2> Exception from random analyzer: 
>    [junit4]   2> charfilters=
>    [junit4]   2>   org.apache.lucene.analysis.charfilter.MappingCharFilter(org.apache.lucene.analysis.charfilter.NormalizeCharMap@3ef95503, java.io.StringReader@70dde633)
>    [junit4]   2>   org.apache.lucene.analysis.fa.PersianCharFilter(org.apache.lucene.analysis.charfilter.MappingCharFilter@12423b20)
>    [junit4]   2> tokenizer=
>    [junit4]   2>   org.apache.lucene.analysis.th.ThaiTokenizer()
>    [junit4]   2> filters=
>    [junit4]   2>   org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilter(ValidatingTokenFilter@7914bba7 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1, org.apache.lucene.analysis.compound.hyphenation.HyphenationTree@abd7bca)
>    [junit4]   2>   Conditional:org.apache.lucene.analysis.MockGraphTokenFilter(java.util.Random@56348091, OneTimeWrapper@aa1c073 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1)
>    [junit4]   2>   Conditional:org.apache.lucene.analysis.shingle.FixedShingleFilter(OneTimeWrapper@4cf58fce term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1, 4, <NUM>, <SOUTHEAST_ASIAN>)
>    [junit4]   2>   org.apache.lucene.analysis.pt.PortugueseLightStemFilter(ValidatingTokenFilter@3a915324 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word,termFrequency=1,keyword=false)
>    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestRandomChains -Dtests.method=testRandomChainsWithLargeStrings -Dtests.seed=92344C536D4E00F4 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=en-ZW -Dtests.timezone=Atlantic/Faroe -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
>    [junit4] ERROR   0.46s J2 | TestRandomChains.testRandomChainsWithLargeStrings <<<
>    [junit4]    > Throwable #1: java.lang.IllegalStateException: stage 3: inconsistent startOffset at pos=0: 0 vs 5; token=effort
>    [junit4]    >      at __randomizedtesting.SeedInfo.seed([92344C536D4E00F4:F86FF34234002007]:0)
>    [junit4]    >      at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:109)
>    [junit4]    >      at org.apache.lucene.analysis.pt.PortugueseLightStemFilter.incrementToken(PortugueseLightStemFilter.java:48)
>    [junit4]    >      at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:68)
>    [junit4]    >      at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkResetException(BaseTokenStreamTestCase.java:441)
>    [junit4]    >      at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:546)
>    [junit4]    >      at org.apache.lucene.analysis.core.TestRandomChains.testRandomChainsWithLargeStrings(TestRandomChains.java:897)
>    [junit4]    >      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    [junit4]    >      at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>    [junit4]    >      at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    [junit4]    >      at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>    [junit4]    >      at java.base/java.lang.Thread.run(Thread.java:844)
>    [junit4]   2> NOTE: test params are: codec=Asserting(Lucene70): {dummy=TestBloomFilteredLucenePostings(BloomFilteringPostingsFormat(Lucene50(blocksize=128)))}, docValues:{}, maxPointsInLeafNode=214, maxMBSortInHeap=5.729405811878087, sim=RandomSimilarity(queryNorm=true): {}, locale=en-ZW, timezone=Atlantic/Faroe
>    [junit4]   2> NOTE: Linux 4.15.0-32-generic amd64/Oracle Corporation 10.0.1 (64-bit)/cpus=8,threads=1,free=266844648,total=518979584
>    [junit4]   2> NOTE: All tests run in this JVM: [TestOptionalCondition, TestSerbianNormalizationRegularFilter, TestCommonGramsFilterFactory, TestDoubleEscape, TestDictionaryCompoundWordTokenFilterFactory, TestNorwegianMinimalStemFilter, TestCzechStemmer, TestTurkishLowerCaseFilterFactory, TestAnalyzers, TestScandinavianFoldingFilterFactory, TestReversePathHierarchyTokenizer, TestSimplePatternTokenizer, TestGalicianMinimalStemFilter, MinHashFilterTest, TestPortugueseStemFilterFactory, TestPersianCharFilter, TestPerFieldAnalyzerWrapper, TestRandomChains]
>    [junit4] Completed [73/291 (1!)] on J2 in 3.12s, 2 tests, 1 error <<< FAILURES!
> {noformat}


