Abstract:
The emergence of a new generation of artificial intelligence, represented by ChatGPT, has sparked intense public debate regarding the governance of AI-related risks. Current academic research focuses primarily on the risks and regulatory frameworks of large-scale text generation models, while comparatively overlooking the more distinctive and potentially higher-risk domain of speech generation AI. The core stages of speech generation are data analysis, acoustic modeling, and waveform synthesis. A closer examination of these stages reveals several risk factors, including privacy breaches, violations of personality rights, and potential misuse of the technology. To prevent such risks effectively, adjustments must be made at two levels: governance philosophy and regulatory measures. At the level of governance philosophy, the principle of purpose limitation should be upheld, emphasizing the contextual specificity of voice data processing; regulation should remain moderate to avoid excessive intervention, with the overarching goal of fostering “trustworthy” AI, that is, ensuring reliability and accountability throughout the lifecycle of speech generation systems. At the level of regulatory measures, it is essential first to clarify the legal scope of “natural voice protection”, then to delineate the responsibilities and authority of regulatory bodies, and finally to establish internal risk assessment and management mechanisms. Together, these measures protect individuals’ rights and provide a sound legal basis for the use and dissemination of AI-generated speech content.