[Paper Review] BLIP (Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation)
[Deep Learning Paper Review Series] This post was migrated from Notion; a cleaner version is available at the Notion link below.
>> Notion link: skillful-freighter-f4a.notion.site

Abstract & Introduction

Background, Motivation

Vision-Language Pre-training (VLP) trains a model on large-scale image-text pairs so that it can handle a variety of vision-language tasks (image-text retrieval, image captioning, ...)