
Paper Link.

 

https://arxiv.org/abs/1711.02578

 

<Main Reference Paper : Show and Tell : A Neural Image Caption Generator>

https://arxiv.org/abs/1411.4555

 

Octavio Arriaga, Paul G. Plöger, Matias Valdenegro

 

Image Captioning -> Anomaly Situations (Dangerous Situations)

 

Alert/non-alert classification accuracy: 97%

Alert image description accuracy (METEOR score): 16.2

 

 

1. Introduction

 

Situations classified as anomalies:

1. broken windows

2. injured people

3. fights

4. explosions

5. car accidents

6. fire accidents

7. guns

8. domestic violence

 

 

Describe the situation with a complete sentence rather than only a category label.

 

1) A new anomaly detection dataset.

2) A model that classifies and describes a dangerous situation.

 

2. Related Work

 

A. Image Captioning

  1. Classifying and describing the environment is one of the hardest open problems in AI.
    1. It combines computer vision and natural language processing.
    2. Deep learning (DL) approaches have tackled this problem [15].
  2. [15] Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge
    1. Trainable end-to-end.
    2. Maximizes the joint log-probability of the word sequence S = {S_0, ..., S_N} conditioned on the input image I and the parameters \theta: log p(S | I; \theta) = \sum_{t=0}^{N} log p(S_t | I, S_0, ..., S_{t-1}; \theta).
    3. Neural Image Caption:
      encode the image with a CNN / generate the sentence with an RNN.
    4. CNN: GoogLeNet (without the last layer): 2048 nodes
      1. Input image -> <CNN> -> <LSTM>: the image is fed to the LSTM only once.
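The log-probability objective noted in the list above can be illustrated with a small sketch. It assumes a hypothetical list of per-word conditional probabilities p(S_t | I, S_0, ..., S_{t-1}); the numbers are made up for illustration and are not model outputs.

```python
import math

def sequence_log_prob(step_probs):
    """Sum of log p(S_t | I, S_0..S_{t-1}) over the words of one caption."""
    return sum(math.log(p) for p in step_probs)

# Hypothetical per-word probabilities for a 4-word caption (illustrative only).
step_probs = [0.5, 0.25, 0.8, 0.5]
log_p = sequence_log_prob(step_probs)

# The sum of logs equals the log of the product of the per-word probabilities.
joint_p = math.exp(log_p)
```

Training maximizes this quantity summed over all (image, caption) pairs, which is why the model can be trained end-to-end with a standard per-word cross-entropy loss.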

B. Anomaly Detection

    - An anomaly is an event that could prove potentially dangerous for humans.

 

3. Anomaly Detection Dataset

- 1008 captioned images corresponding to anomalies.

- The situations are the ones listed above.

- Intended for robot activities such as patrolling or domestic service.

- Creative Commons (CC) licensed.

  - Attribution to the original authors, non-commercial agreement.

  - Captioned by 20 persons.

 

Rules of captioning images.

  1. Write a single English sentence for each image.
  2. Use the present or present continuous tense (to be, ~ing).
  3. Write primarily about the accident/incident/anomaly in the image.
  4. Be explicit about the number of persons in the image.
  5. No digits (1, 2, 3, ...); use written numbers (two, three, four, ...).
  6. Be explicit about gender (man, woman).
  7. Sentence length: 7~18 words recommended.
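Some of the captioning rules above can be checked mechanically. The sketch below is a hypothetical helper, not something described in the paper: it only covers the single-sentence, no-digits, and length rules, and the sentence check is a rough heuristic that would misfire on abbreviations such as "Mr.".

```python
import re

MIN_WORDS, MAX_WORDS = 7, 18  # recommended length from rule 7

def check_caption(caption):
    """Return a list of rule violations for one caption (empty list = OK)."""
    problems = []
    words = caption.split()
    # Rule 5: numbers must be written as words, so any digit is a violation.
    if re.search(r"\d", caption):
        problems.append("contains digits; write numbers as words")
    # Rule 1 (heuristic): a terminator followed by more text suggests several sentences.
    if re.search(r"[.!?]\s+\S", caption.strip()):
        problems.append("more than one sentence")
    # Rule 7: recommended length range.
    if not MIN_WORDS <= len(words) <= MAX_WORDS:
        problems.append("length outside the recommended 7~18 words")
    return problems

ok = check_caption("Two men are fighting in front of a broken window.")
bad = check_caption("2 men fight.")
multi = check_caption("A car is on fire. A man is running away from it.")
```

The tense and gender rules need linguistic analysis and are left to the human annotators in this sketch.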

 

4. Model

  1. [15] + classification module
  2. Training datasets: IAPR-2012 ([6], 20000 captioned images, copyright free)
    / AD (self-built, dangerous situations) dataset
  3. Caption preprocessing:
    - non-alphanumeric characters removed
    - converted to lowercase
    - captions shorter than 3 words removed
    - captions longer than 14 words removed: anomalous situations should be described concisely.
  4. Vocabulary size: 1078
  5. 299*299 image -> <CNN> -> 2048 features.
  6. -> { <MLP: binary classifier, 2 hidden layers>, <LSTM: 512, 512 neurons, image embeddings> }
  7. 80/20 train/validation split
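The caption preprocessing and the 80/20 split listed above could look roughly like the following. This is a minimal sketch under assumptions: the note's "alphanumeric" item is read as stripping non-alphanumeric characters, and the function names, the seed, and the sample captions are hypothetical.

```python
import random
import re

MIN_WORDS, MAX_WORDS = 3, 14  # length limits from item 3

def clean_caption(caption):
    """Lowercase, drop every character that is not a letter, digit, or space, and tokenize."""
    return re.sub(r"[^a-z0-9 ]+", "", caption.lower()).split()

def build_dataset(captions, train_fraction=0.8, seed=0):
    """Filter captions by length, build the vocabulary, and split train/validation."""
    tokenized = [clean_caption(c) for c in captions]
    kept = [t for t in tokenized if MIN_WORDS <= len(t) <= MAX_WORDS]
    vocabulary = sorted({word for tokens in kept for word in tokens})
    shuffled = kept[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * train_fraction)
    return vocabulary, shuffled[:cut], shuffled[cut:]

# Made-up captions for illustration; "Fire." is dropped as shorter than 3 words.
captions = [
    "A man is lying injured on the street!",
    "Fire.",
    "Two cars have crashed.",
    "A woman points a gun at a man.",
    "Smoke is rising from a burning house.",
]
vocabulary, train, validation = build_dataset(captions)
```

Applied to the real IAPR-2012 and AD captions, the length of `vocabulary` would correspond to the vocabulary size reported in item 4.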

5. Metrics

 - BLEU

 - METEOR[1]

  1.   Word similarity: exact / stem / synonym
  2.   Order: crossings
  3.   Chunks: # broken
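The interplay of matching, order, and chunks can be seen in a simplified METEOR sketch. It is an illustration under stated assumptions, not the scorer used in the paper: only exact matches are considered (no stem or synonym stages), the alignment is greedy instead of minimizing crossings, and the constants follow the original METEOR formulation (Banerjee and Lavie, 2005), which may differ from the version cited as [1].

```python
def meteor_exact(hypothesis, reference):
    """Simplified METEOR: exact matches only, greedy alignment, 2005 constants."""
    hyp, ref = hypothesis.lower().split(), reference.lower().split()
    used = set()
    alignment = []  # (hyp_index, ref_index) pairs in hypothesis order
    for i, word in enumerate(hyp):
        for j, ref_word in enumerate(ref):
            if j not in used and word == ref_word:
                used.add(j)
                alignment.append((i, j))
                break
    matches = len(alignment)
    if matches == 0:
        return 0.0
    precision, recall = matches / len(hyp), matches / len(ref)
    # Harmonic mean weighted heavily toward recall.
    f_mean = 10 * precision * recall / (recall + 9 * precision)
    # A chunk is a run of matches that is adjacent and in the same order in both sentences.
    chunks = 1
    for (i0, j0), (i1, j1) in zip(alignment, alignment[1:]):
        if i1 != i0 + 1 or j1 != j0 + 1:
            chunks += 1
    penalty = 0.5 * (chunks / matches) ** 3  # more broken chunks, larger penalty
    return f_mean * (1 - penalty)

reference = "a man is holding a gun"
same = meteor_exact("a man is holding a gun", reference)
reordered = meteor_exact("a gun a man is holding", reference)
unrelated = meteor_exact("two cars crashed", reference)
```

The reordered caption contains the same words as the reference but is split into more chunks, so it scores lower than the identical caption.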

 

6. Results

 

Positive : Non-anomaly

Negative : Anomaly
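Because the positive class here is the non-anomaly, the usual confusion-matrix terms flip meaning: a false positive is a dangerous image labeled as safe. A small sketch with made-up labels (not the paper's results) makes the convention concrete; the function name and label strings are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive="non-anomaly"):
    """Count TP, FP, TN, FN with the non-anomaly class treated as positive."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1  # an anomaly predicted as safe: a missed alert
        else:
            if truth == positive:
                fn += 1  # a safe image predicted as an anomaly: a false alarm
            else:
                tn += 1
    return tp, fp, tn, fn

# Made-up labels for illustration only.
y_true = ["anomaly", "anomaly", "non-anomaly", "non-anomaly", "non-anomaly"]
y_pred = ["anomaly", "non-anomaly", "non-anomaly", "non-anomaly", "anomaly"]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
```

For a patrolling robot, the false positives under this convention are the costly errors, since they correspond to dangerous situations that raise no alert.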

 

The captions describe the situation well in few words.

Applicable to robot patrolling.

Soft-Rep <-> Hard evidence.

The METEOR score peaks several epochs after the loss minimum.

-> Synonym matching algorithm: uniform distribution over synonym word vectors.

[Paper study] Image Captioning and Classification of Dangerous Situations, posted by June2374, 2017.11.22