[ https://issues.apache.org/jira/browse/OPENNLP-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836892#comment-17836892 ]
Martin Wiesner commented on OPENNLP-1546:
-----------------------------------------

[~jzemerick] FYI: I changed the code fragment and external reference to the most recent manual version 2.3.2.

> NER training code example in documentation needs updated
> --------------------------------------------------------
>
>                 Key: OPENNLP-1546
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1546
>             Project: OpenNLP
>          Issue Type: Task
>          Components: Documentation
>            Reporter: Jeff Zemerick
>            Assignee: Cody Fearer
>            Priority: Major
>
> The NER training code example needs updated.
> [https://opennlp.apache.org/docs/2.3.2/manual/opennlp.html#tools.namefind.training.api]
> * The `TokenNameFinderFactory nameFinderFactory` part won't compile.
> * This code might be outdated in general.
> {code:java}
> ObjectStream<String> lineStream =
>     new PlainTextByLineStream(new MarkableFileInputStreamFactory(new File("en-ner-person.train")), StandardCharsets.UTF_8);
> TokenNameFinderModel model;
> try (ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream)) {
>   model = NameFinderME.train("eng", "person", sampleStream,
>       TrainingParameters.defaultParams(), nameFinderFactory);
> }
> try (ObjectStream modelOut = new BufferedOutputStream(new FileOutputStream(modelFile)){
>   model.serialize(modelOut);
> }
> {code}
> For reference (but not tested):
> {code:java}
> final InputStreamFactory in = new MarkableFileInputStreamFactory(convertedTrainingFile);
> final ObjectStream<NameSample> sampleStream = new NameSampleDataStream(new PlainTextByLineStream(in, StandardCharsets.UTF_8));
> final TokenNameFinderModel nameFinderModel = NameFinderME.train("en",
>     null, sampleStream, TrainingParameters.defaultParams(),
>     TokenNameFinderFactory.create(null, null, Collections.emptyMap(), new BioCodec()));
> {code}
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
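
Note on the quoted snippet: apart from the undefined `nameFinderFactory`, the last `try` block would not compile as written either. It declares the stream as `ObjectStream` although `BufferedOutputStream` is a `java.io.OutputStream`, and its resource clause is missing a closing parenthesis. A minimal JDK-only sketch of that serialization step might look as follows. The names `ModelWriteSketch`, `writeModel`, and `ModelSerializer` are hypothetical stand-ins introduced here so the fragment runs without OpenNLP on the classpath; in the documented example the call would be `model.serialize(modelOut)` on the trained `TokenNameFinderModel`.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class ModelWriteSketch {

    /** Hypothetical stand-in for an object that can write itself to a stream. */
    @FunctionalInterface
    public interface ModelSerializer {
        void serialize(OutputStream out) throws IOException;
    }

    /** Writes the model to modelFile; try-with-resources flushes and closes the stream. */
    public static void writeModel(ModelSerializer model, File modelFile) throws IOException {
        // Declared as OutputStream (not ObjectStream), with balanced parentheses.
        try (OutputStream modelOut = new BufferedOutputStream(new FileOutputStream(modelFile))) {
            model.serialize(modelOut);
        }
    }

    public static void main(String[] args) throws IOException {
        File modelFile = File.createTempFile("en-ner-person", ".bin");
        modelFile.deleteOnExit();
        // Placeholder bytes are written here instead of a trained model (assumption for the sketch).
        writeModel(out -> out.write("placeholder".getBytes(StandardCharsets.UTF_8)), modelFile);
        System.out.println(Files.size(modelFile.toPath()) + " bytes written");
    }
}
```

This only covers the output step; the training call itself and the construction of the factory are left as quoted in the issue.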