Apache OpenNLP 1.8.0 has been released. OpenNLP is a machine learning toolkit for processing natural language text. It supports most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, and parsing.

This release brings many new features, improvements, and bug fixes. The API has been reworked for better consistency, and many deprecated methods have been removed.

Changes in this release:

POS Tagger context generator now supports feature generation XML
Add a Name Finder feature generator that adds POS Tag features
Add CONLL-U format support
Improve default Name Finder settings
TokenNameFinderEvaluator CLI now supports the nameTypes argument
Stupid backoff is now the default in NGramLanguageModel
Language codes are now ISO 639-3 compliant
Add many unit tests
Distribution package now includes an example parameters file
Prefix and suffix feature generators are now configurable
Remove API in Document Categorizer for user specified tokenizer
Learnable lemmatizer now returns all possible lemmas for a given word and POS tag
Lemmatizer API backward compatibility break: lemmas no longer need to be encoded/decoded; the LemmatizerME lemmatize method now returns the actual lemma
Add stemmer, detokenizer and sentence detection abbreviations for Irish
Chunker SequenceValidator signature changed to allow access to both token and POS tag

Download: https://opennlp.apache.org/download.html
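
One entry in the change list above notes that stupid backoff is now the default in NGramLanguageModel. For readers unfamiliar with the technique, the following is a minimal, self-contained sketch of stupid backoff scoring in Python. It is not OpenNLP's implementation and does not use the OpenNLP API: the function names `count_ngrams` and `stupid_backoff_score` are hypothetical, and the back-off factor of 0.4 is the commonly used value, assumed here rather than taken from the release notes.

```python
from collections import Counter

ALPHA = 0.4  # back-off factor; assumed value, not taken from OpenNLP


def count_ngrams(tokens, max_order):
    """Count every n-gram of length 1..max_order in a token sequence."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def stupid_backoff_score(counts, total_tokens, context, word, alpha=ALPHA):
    """Score `word` given the preceding `context` words.

    The result is a relative score, not a normalized probability.
    """
    context = tuple(context)
    penalty = 1.0
    while context:
        ngram = context + (word,)
        if counts[ngram] > 0:
            # Seen n-gram: relative frequency, scaled by accumulated penalty.
            return penalty * counts[ngram] / counts[context]
        # Unseen n-gram: drop the oldest context word and apply the penalty.
        context = context[1:]
        penalty *= alpha
    # Base case: unigram relative frequency.
    return penalty * counts[(word,)] / total_tokens
```

With counts built from the six tokens of "the cat sat on the mat" and `max_order=3`, scoring "cat" after the context ("the",) uses the bigram directly, while scoring "cat" after ("on",) finds no matching bigram and falls back to the unigram frequency multiplied by the back-off factor.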