Start of main content
Talk type: Talk
Video embeddings and tasks that are solved with their help in Yandex
The standard approach in machine learning is to pre-train a neural network that projects the object in question (video, picture, text) into a multidimensional vector space, and then, using these representations, solve other tasks (classification, recommendation, ranking, similarity search).
We will talk about a model for constructing a common embedding space for videos and texts, its training and use for different application tasks.