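A paper written by members of our association, "Identification of Chinese dark jargons in Telegram underground markets using context-oriented and linguistic features", was recently published in Information Processing & Management, a CCF-B journal. The first author is Yiwei Hou (侯伊为, undergraduate in Cyberspace Security, class of 2018), the co-author is Hailin Wang (王海林, undergraduate in Cyberspace Security, class of 2018), and the advisor is Associate Professor Haizhou Wang (王海舟).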
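The authors propose CJI-Framework, a new framework for detecting Chinese dark jargon. First, they collected chat records from Telegram groups involved in underground market trading and built TUMCC, a Chinese corpus of underground market language. TUMCC contains 3,863 sentences and 100,000 characters, and it is the first open-source corpus of Chinese dark jargon. Next, they extracted seven new features from three perspectives (word-vector computation, lexical analysis, and dictionary analysis) to distinguish jargon from ordinary vocabulary. Finally, an anomaly detection algorithm uses these features to identify potential jargon. The framework is further strengthened with a word-vector mapping method and a transfer learning method. Experimental results show that the approach achieves a high recognition rate for Chinese dark jargon and still performs well when adapted to English.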
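The final step above, flagging words whose features deviate from those of ordinary vocabulary, can be illustrated with a small sketch. This is not the CJI-Framework implementation: the detector is a simple median-absolute-deviation (MAD) rule swapped in for illustration, and the two feature columns, the sample words, the function names, and the threshold are all hypothetical.

```python
from statistics import median


def mad_scores(values):
    """Return robust z-scores based on the median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values are (nearly) identical, so nothing stands out.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
    return [0.6745 * (v - med) / mad for v in values]


def flag_candidates(features, threshold=3.5):
    """Flag words with at least one feature that is an outlier.

    features: dict mapping word -> list of floats (same length for every word).
    """
    words = list(features)
    n_features = len(next(iter(features.values())))
    flagged = set()
    for j in range(n_features):
        column = [features[w][j] for w in words]
        for word, score in zip(words, mad_scores(column)):
            if abs(score) > threshold:
                flagged.add(word)
    return sorted(flagged)


# Hypothetical toy data: two made-up features per word, for example a
# cross-corpus embedding similarity and a dictionary-based score.
toy_features = {
    "apple": [0.91, 0.88],
    "phone": [0.89, 0.90],
    "card": [0.90, 0.86],
    "price": [0.92, 0.89],
    "group": [0.88, 0.87],
    "jargon_x": [0.15, 0.20],
}

candidates = flag_candidates(toy_features)
print(candidates)
```

In this toy data only "jargon_x" has feature values far from the other words, so it is the only candidate returned. A real system would compute its features from corpora and would likely use a stronger anomaly detector.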

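About the journal:
Information Processing & Management is ranked in Zone 1 for Computer Science in the Chinese Academy of Sciences (CAS) SCI journal ranking and is a CCF-B journal. Its main research topics include the design of information systems and computing applications.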
Official description: Information Processing & Management publishes cutting-edge original research at the intersection of computing and information science concerning theory, methods, or applications in a range of domains, including but not limited to advertising, business, health, information science, information technology marketing, and social computing.
The journal aims to serve the interests of primary researchers but also practitioners in furthering knowledge at the intersection of computing and information science by providing an effective forum for the timely dissemination of advanced and topical issues. The journal is especially interested in original research articles, research survey articles, research method articles, and articles addressing critical applications of research.
Paper link: https://www.sciencedirect.com/science/article/abs/pii/S030645732200142X