Strategies that manipulate redundant information in videos to stealthily embed malicious content:
- Spatio-Temporal Composition (STC): Malicious content is split across multiple frames. When viewed together, the fragments form a complete harmful message.
- Dynamic Element Transition: Adversaries manipulate the temporal evolution of a video to convey implicit intent, which includes:
  - Semantic Concept Transition (SCT): Transitions between different semantic concepts in a video can carry malicious intent.
  - Visual Style Transition (VST): The aesthetic and atmospheric evolution of a video can also carry information, and such stylistic changes are rarely specified in user prompts.