[jira] [Commented] (LUCENE-7866) Add TokenFilter to add custom term frequency (like DelimitedPayloadTokenFilter)

Steve Rowe (JIRA) Fri, 09 Jun 2017 16:30:34 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16045214#comment-16045214 ]


Steve Rowe commented on LUCENE-7866:
------------------------------------

My Jenkins found a reproducing {{TestFactories}} failure:

{noformat}
   [junit4] Suite: org.apache.lucene.analysis.core.TestFactories
   [junit4]   2> TEST FAIL: useCharFilter=true text='-q)|.f nvexq '
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestFactories -Dtests.method=test -Dtests.seed=CF319AD344836FD4 -Dtests.slow=true -Dtests.locale=it -Dtests.timezone=America/Pangnirtung -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
   [junit4] ERROR   27.1s J0 | TestFactories.test <<<
   [junit4]    > Throwable #1: java.lang.NumberFormatException: Unable to parse
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([CF319AD344836FD4:4765A509EA7F022C]:0)
   [junit4]    >        at org.apache.lucene.util.ArrayUtil.parse(ArrayUtil.java:94)
   [junit4]    >        at org.apache.lucene.util.ArrayUtil.parseInt(ArrayUtil.java:83)
   [junit4]    >        at org.apache.lucene.util.ArrayUtil.parseInt(ArrayUtil.java:51)
   [junit4]    >        at org.apache.lucene.analysis.miscellaneous.DelimitedTermFrequencyTokenFilter.incrementToken(DelimitedTermFrequencyTokenFilter.java:67)
   [junit4]    >        at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:731)
   [junit4]    >        at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:642)
   [junit4]    >        at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:540)
   [junit4]    >        at org.apache.lucene.analysis.core.TestFactories.doTestTokenFilter(TestFactories.java:105)
   [junit4]    >        at org.apache.lucene.analysis.core.TestFactories.test(TestFactories.java:58)
   [junit4]    >        at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> NOTE: test params are: codec=CheapBastard, sim=RandomSimilarity(queryNorm=true): {}, locale=it, timezone=America/Pangnirtung
   [junit4]   2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 1.8.0_77 (64-bit)/cpus=16,threads=1,free=400823216,total=514850816
   [junit4]   2> NOTE: All tests run in this JVM: [TestApostropheFilter, TestZeroAffix, TestDutchAnalyzer, TestMorphData, TestCodepointCountFilter, TestKeywordTokenizer, TestFactories]
   [junit4] Completed [126/281 (1!)] on J0 in 27.14s, 1 test, 1 error <<< FAILURES!
{noformat}
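
One plausible reading of the trace, sketched in plain Java rather than with the Lucene classes: the random text contains the token {{-q)|.f}}, and whatever follows the delimiter gets parsed as an integer. The helper name parseTermFrequency, the {{|}} delimiter, the whitespace tokenization and the default frequency of 1 are assumptions for illustration only; the real filter goes through {{ArrayUtil.parseInt}}, for which {{Integer.parseInt}} stands in here.

```java
public class TermFreqParseFailureSketch {

    /**
     * Hypothetical helper, not the Lucene implementation: returns the integer
     * that follows the first delimiter, or 1 when the token has no delimiter.
     */
    static int parseTermFrequency(String token, char delimiter) {
        int idx = token.indexOf(delimiter);
        if (idx < 0) {
            return 1; // assumed default term frequency
        }
        // Throws NumberFormatException when the suffix is not an integer.
        return Integer.parseInt(token.substring(idx + 1));
    }

    public static void main(String[] args) {
        // Random text taken from the "TEST FAIL" line of the log.
        String text = "-q)|.f nvexq ";
        // Assumption: tokens are split on whitespace.
        for (String token : text.trim().split("\\s+")) {
            try {
                System.out.println(token + " -> " + parseTermFrequency(token, '|'));
            } catch (NumberFormatException e) {
                System.out.println(token + " -> NumberFormatException");
            }
        }
    }
}
```

If that reading is right, any random token with a non-numeric suffix after the delimiter would trip the same exception.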

> Add TokenFilter to add custom term frequency (like 
> DelimitedPayloadTokenFilter)
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-7866
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7866
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: master (7.0)
>
>         Attachments: LUCENE-7866.patch, LUCENE-7866.patch, LUCENE-7866.patch, 
> LUCENE-7866.patch
>
>
> This is a follow-up to LUCENE-7854. This will add a simple {{TokenFilter}} 
> like {{DelimitedPayloadTokenFilter}} that can be used to index a custom term 
> frequency: {{"token|5"}} will index the token "token" with a term freq of 5. 
> The effect is the same as adding the token 5 times by a "repeat token filter".
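
The equivalence stated in the description can be illustrated with a minimal sketch in plain Java, without any Lucene classes. The class and method names below are hypothetical, and a simple map of term to frequency stands in for the index. It only shows that counting {{"token|5"}} once gives the same frequencies as counting "token" five times.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CustomTermFreqSketch {

    /** Hypothetical: counts terms, reading an optional "|N" suffix as the frequency. */
    static Map<String, Integer> countDelimited(List<String> tokens, char delimiter) {
        Map<String, Integer> freqs = new HashMap<>();
        for (String token : tokens) {
            int idx = token.indexOf(delimiter);
            String term = idx < 0 ? token : token.substring(0, idx);
            int freq = idx < 0 ? 1 : Integer.parseInt(token.substring(idx + 1));
            freqs.merge(term, freq, Integer::sum);
        }
        return freqs;
    }

    /** Hypothetical: counts terms the plain way, one occurrence per token. */
    static Map<String, Integer> countRepeated(List<String> tokens) {
        Map<String, Integer> freqs = new HashMap<>();
        for (String token : tokens) {
            freqs.merge(token, 1, Integer::sum);
        }
        return freqs;
    }

    public static void main(String[] args) {
        Map<String, Integer> delimited = countDelimited(Arrays.asList("token|5"), '|');
        Map<String, Integer> repeated = countRepeated(Collections.nCopies(5, "token"));
        // Both maps hold the single entry token=5.
        System.out.println(delimited.equals(repeated));
    }
}
```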



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
