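Classifying MNIST with Swin Transformer

Installing dependencies

  • timm provides building blocks used by Swin Transformer, such as the DropPath layer.

  • torch is the base library for building neural networks and running automatic backpropagation.

  • sys provides an interface to system information and interpreter-level operations.

  • logging provides logging facilities.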

!pip install timm
Collecting timm
Downloading timm-0.6.7-py3-none-any.whl (509 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 510.0/510.0 kB 799.5 kB/s eta 0:00:00
Requirement already satisfied: torchvision in /opt/conda/lib/python3.7/site-packages (from timm) (0.12.0)
Requirement already satisfied: torch>=1.4 in /opt/conda/lib/python3.7/site-packages (from timm) (1.11.0)
Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.7/site-packages (from torch>=1.4->timm) (4.3.0)
Requirement already satisfied: requests in /opt/conda/lib/python3.7/site-packages (from torchvision->timm) (2.28.1)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /opt/conda/lib/python3.7/site-packages (from torchvision->timm) (9.1.1)
Requirement already satisfied: numpy in /opt/conda/lib/python3.7/site-packages (from torchvision->timm) (1.21.6)
Requirement already satisfied: charset-normalizer<3,>=2 in /opt/conda/lib/python3.7/site-packages (from requests->torchvision->timm) (2.1.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /opt/conda/lib/python3.7/site-packages (from requests->torchvision->timm) (1.26.12)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.7/site-packages (from requests->torchvision->timm) (2022.6.15.2)
Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.7/site-packages (from requests->torchvision->timm) (3.3)
Installing collected packages: timm
Successfully installed timm-0.6.7
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

import os
import sys
import logging
import functools

import torch
from torch import nn, optim
import torch.backends.cudnn as cudnn
import torch.utils.checkpoint as checkpoint

from timm.data import Mixup
from timm.loss import LabelSmoothingCrossEntropy, SoftTargetCrossEntropy
from timm.utils import accuracy, AverageMeter
from timm.models.layers import DropPath, to_2tuple, trunc_normal_
from timm.optim.nadam import Nadam
from timm.scheduler.cosine_lr import CosineLRScheduler

from termcolor import colored

from torchvision.datasets import MNIST

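Create a logging helper that writes formatted output to a file (and, on the main process, to the console). This is useful for keeping track of what a large network is doing while debugging.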

@functools.lru_cache()
def create_logger(output_dir, dist_rank=0, name=''):
    # Create the logger; with a hierarchical name, records would also be passed to the parent loggers
    logger = logging.getLogger(name)
    # Set the logging level; the available levels are CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET
    logger.setLevel(logging.DEBUG)
    # Disable propagation to parent loggers
    logger.propagate = False

    # Create the formatters (plain for the file, colored for the console)
    fmt = '[%(asctime)s %(name)s] (%(filename)s %(lineno)d): %(levelname)s %(message)s'
    color_fmt = colored('[%(asctime)s %(name)s]', 'green') + \
        colored('(%(filename)s %(lineno)d)', 'yellow') + ': %(levelname)s %(message)s'

    # Create the console handler for the master process only
    if dist_rank == 0:
        # The master process also writes log records to standard output
        console_handler = logging.StreamHandler(sys.stdout)
        console_handler.setLevel(logging.DEBUG)
        console_handler.setFormatter(
            logging.Formatter(fmt=color_fmt, datefmt='%Y-%m-%d %H:%M:%S'))
        logger.addHandler(console_handler)

    # Create the file handler: every rank appends its records to its own log file
    file_handler = logging.FileHandler(os.path.join(output_dir, f'log_rank{dist_rank}.txt'), mode='a')
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(logging.Formatter(fmt=fmt, datefmt='%Y-%m-%d %H:%M:%S'))
    logger.addHandler(file_handler)

    return logger
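
A minimal usage sketch of the same idea, assuming only the standard library: it reproduces the file-handler half of the helper above without colors or a console handler. The function name make_file_logger and the logger name "demo" are hypothetical and exist only for this illustration.

```python
import logging
import os
import tempfile


def make_file_logger(output_dir, dist_rank=0, name="demo"):
    """Hypothetical stdlib-only variant of the logger helper (file handler only)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep records away from the root logger
    fmt = "[%(asctime)s %(name)s] (%(filename)s %(lineno)d): %(levelname)s %(message)s"
    path = os.path.join(output_dir, f"log_rank{dist_rank}.txt")
    handler = logging.FileHandler(path, mode="a")
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(logging.Formatter(fmt=fmt, datefmt="%Y-%m-%d %H:%M:%S"))
    logger.addHandler(handler)
    return logger, path


out_dir = tempfile.mkdtemp()          # throwaway directory for the demo
demo_logger, log_path = make_file_logger(out_dir)
demo_logger.info("epoch 0 started")
for h in demo_logger.handlers:        # make sure the record is on disk before reading
    h.flush()
with open(log_path) as f:
    log_text = f.read()
print(log_text)
```

Each rank gets its own file name (log_rank0.txt, log_rank1.txt, ...), so processes in a distributed run do not write into the same file.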

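Basic structure of the Swin Transformer:

The network consists of a Patch Partition layer that splits the image into patches, a Linear Embedding layer that changes the channel count of the patched image, and four Basic Blocks that perform the hierarchical transformer computation. The data flow is as follows:

  • In the code, the Patch Partition and Linear Embedding layers are merged into a single Patch Embed layer, which both splits the image into patches and changes the number of channels. The input image goes from (B, H, W, 3) to (B, H//patch_size, W//patch_size, patch_size*patch_size*3) and then to (B, H//patch_size, W//patch_size, 96). The code performs both steps at once with a convolution whose kernel size and stride both equal patch_size. Call the resulting shape (B, H0, W0, C). Note that this shape follows the code rather than the figure (the code then flattens the tokens to (B, H0*W0, C)).

  • A BasicBlock (BasicLayer in the code) consists of several Swin Transformer Blocks followed by a Patch Merging layer. The network has four Basic Blocks containing [2, 2, 6, 2] Swin Transformer Blocks respectively. In the official design the blocks inside a Basic Block alternate between regular window attention (W-MSA) and shifted-window attention (SW-MSA), so in a two-block Basic Block only the first block skips the shifted-window operation.

    1. A Swin Transformer Block first applies window_partition to split the feature map into (H0//window_size) * (W0//window_size) windows of shape (B, window_size, window_size, C) and arranges them as (H0//window_size * W0//window_size * B, window_size * window_size, C), written (nW*B, N, C). Here nW = (H0//window_size) * (W0//window_size) is the number of windows and N = window_size * window_size is the number of tokens per window. It then runs the multi-head WindowAttention operation.

    2. WindowAttention first uses a fully connected layer to produce three matrices Q, K and V, each of shape (nW*B, N, C), and reshapes them by the number of heads to (nW*B, num_heads, N, C//num_heads), i.e. nW*B*num_heads matrices of shape (N, C//num_heads). Computing attn = Q K^\top gives an attention matrix of shape (nW*B, num_heads, N, N); the code also scales Q by head_dim ** -0.5, adds the relative position bias and applies softmax. Multiplying attn by V gives a tensor of shape (nW*B, num_heads, N, C//num_heads), which is finally rearranged to (nW*B, N, C) for output. Because the input and output shapes are identical, WindowAttention can be stacked to any depth.

    3. Before PatchMerging, the (nW*B, N, C) output of the stacked WindowAttention layers is merged back from windows into (B, N*nW, C), which is (B, H0*W0, C) in terms of the input feature map, and then reshaped to the (B, H0, W0, C) layout that the Swin Transformer Block started from. PatchMerging samples every other row and column to obtain four tensors of shape (B, H0//2, W0//2, C), concatenates them into (B, H0//2, W0//2, 4*C), and feeds the result to a fully connected layer that reduces it to (B, H0//2, W0//2, 2*C). The layer therefore maps (B, H0, W0, C) to (B, H0//2, W0//2, 2*C), which implements downsampling.

  • Because the last Basic Block has no downsampling layer, the output after the four Basic Blocks is (B, H0//2^3, W0//2^3, C*2^3). AvgPool converts it to (B, 1, C*2^3), which is then passed to a fully connected layer for classification.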
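
The shape bookkeeping described above can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, not part of the model: the function name swin_shapes and the dictionary keys are hypothetical, the batch dimension is left out, and the defaults assume a 224 x 224 input with patch_size 4, embedding dimension 96 and window_size 7.

```python
def swin_shapes(img_size=224, patch_size=4, embed_dim=96, num_stages=4, window_size=7):
    """Trace (H, W, C), nW and N of the feature map entering each Basic Block."""
    h = w = img_size // patch_size      # H0, W0 after Patch Embed
    c = embed_dim                       # C after Patch Embed
    trace = []
    for stage in range(num_stages):
        n_windows = (h // window_size) * (w // window_size)   # nW
        tokens = window_size * window_size                    # N
        trace.append({"stage": stage, "H": h, "W": w, "C": c, "nW": n_windows, "N": tokens})
        if stage < num_stages - 1:
            # Patch Merging: halve H and W, double C (4*C after concat, reduced to 2*C)
            h, w, c = h // 2, w // 2, c * 2
    return trace


for row in swin_shapes():
    print(row)
```

With the defaults, the last entry is H = W = 7 and C = 768, which matches (H0//2^3, W0//2^3, C*2^3) for H0 = W0 = 56 and C = 96.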

class Mlp(nn.Module):
    def __init__(self, in_features, hidden_features=None, out_features=None, act_layer=nn.GELU, drop=0.):
        super().__init__()
        out_features = out_features or in_features
        hidden_features = hidden_features or in_features
        self.fc1 = nn.Linear(in_features, hidden_features)
        self.act = act_layer()
        self.fc2 = nn.Linear(hidden_features, out_features)
        self.drop = nn.Dropout(drop)

    def forward(self, x):
        x = self.fc1(x)
        x = self.act(x)
        x = self.drop(x)
        x = self.fc2(x)
        x = self.drop(x)
        return x


def window_partition(x, window_size):
    """
    Args:
        x: (B, H, W, C)
        window_size (int): window size

    Returns:
        windows: (num_windows*B, window_size, window_size, C)
    """
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    windows = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window_size, window_size, C)
    return windows


def window_reverse(windows, window_size, H, W):
    """
    Args:
        windows: (num_windows*B, window_size, window_size, C)
        window_size (int): Window size
        H (int): Height of image
        W (int): Width of image

    Returns:
        x: (B, H, W, C)
    """
    # Merge the nW*B windows back into B feature maps
    # B = B*nW / nW = windows.shape[0] / (H * W / window_size / window_size)
    B = int(windows.shape[0] / (H * W / window_size / window_size))
    x = windows.view(B, H // window_size, W // window_size, window_size, window_size, -1)
    x = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(B, H, W, -1)
    return x


class WindowAttention(nn.Module):
    r""" Window based multi-head self attention (W-MSA) module with relative position bias.
    It supports both of shifted and non-shifted window.

    Args:
        dim (int): Number of input channels.
        window_size (tuple[int]): The height and width of the window.
        num_heads (int): Number of attention heads.
        qkv_bias (bool, optional): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float | None, optional): Override default qk scale of head_dim ** -0.5 if set
        attn_drop (float, optional): Dropout ratio of attention weight. Default: 0.0
        proj_drop (float, optional): Dropout ratio of output. Default: 0.0
    """

    def __init__(self, dim, window_size, num_heads, qkv_bias=True, qk_scale=None, attn_drop=0., proj_drop=0.):

        super().__init__()
        self.dim = dim
        self.window_size = window_size  # Wh, Ww
        self.num_heads = num_heads
        head_dim = dim // num_heads
        self.scale = qk_scale or head_dim ** -0.5

        # define a parameter table of relative position bias
        self.relative_position_bias_table = nn.Parameter(
            torch.zeros((2 * window_size[0] - 1) * (2 * window_size[1] - 1), num_heads))  # 2*Wh-1 * 2*Ww-1, nH

        # get pair-wise relative position index for each token inside the window
        coords_h = torch.arange(self.window_size[0])
        coords_w = torch.arange(self.window_size[1])
        # pass indexing='ij' explicitly to avoid the torch.meshgrid warning
        coords = torch.stack(torch.meshgrid([coords_h, coords_w], indexing='ij'))  # 2, Wh, Ww
        coords_flatten = torch.flatten(coords, 1)  # 2, Wh*Ww
        relative_coords = coords_flatten[:, :, None] - coords_flatten[:, None, :]  # 2, Wh*Ww, Wh*Ww
        relative_coords = relative_coords.permute(1, 2, 0).contiguous()  # Wh*Ww, Wh*Ww, 2
        relative_coords[:, :, 0] += self.window_size[0] - 1  # shift to start from 0
        relative_coords[:, :, 1] += self.window_size[1] - 1
        relative_coords[:, :, 0] *= 2 * self.window_size[1] - 1
        relative_position_index = relative_coords.sum(-1)  # Wh*Ww, Wh*Ww
        # register as a buffer: not trainable, but saved together with the model
        self.register_buffer("relative_position_index", relative_position_index)
        # e.g. 96 -> 96*3
        self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
        self.attn_drop = nn.Dropout(attn_drop)
        self.proj = nn.Linear(dim, dim)
        self.proj_drop = nn.Dropout(proj_drop)
        # truncated normal initialization with std=0.02 (timm's default cut-offs are a=-2, b=2)
        trunc_normal_(self.relative_position_bias_table, std=.02)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x, mask=None):
        """
        Args:
            x: input features with shape of (num_windows*B, N, C)
            mask: (0/-inf) mask with shape of (num_windows, Wh*Ww, Wh*Ww) or None
        """
        B_, N, C = x.shape
        qkv = self.qkv(x).reshape(B_, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
        # (3, B_, num_heads, N, C//num_heads)
        q, k, v = qkv[0], qkv[1], qkv[2]  # make torchscript happy (cannot use tensor as tuple)

        # scaled dot-product attention
        q = q * self.scale
        attn = (q @ k.transpose(-2, -1))
        # (B_, num_heads, N, N)

        relative_position_bias = self.relative_position_bias_table[self.relative_position_index.view(-1)].view(
            self.window_size[0] * self.window_size[1], self.window_size[0] * self.window_size[1], -1)  # Wh*Ww,Wh*Ww,nH
        relative_position_bias = relative_position_bias.permute(2, 0, 1).contiguous()  # nH, Wh*Ww, Wh*Ww
        attn = attn + relative_position_bias.unsqueeze(0)

        if mask is not None:
            nW = mask.shape[0]
            # the mask suppresses attention between tokens that should not be connected
            attn = attn.view(B_ // nW, nW, self.num_heads, N, N) + mask.unsqueeze(1).unsqueeze(0)
            attn = attn.view(-1, self.num_heads, N, N)
            attn = self.softmax(attn)
        else:
            attn = self.softmax(attn)
        # dropout on the attention map
        attn = self.attn_drop(attn)

        x = (attn @ v).transpose(1, 2).reshape(B_, N, C)
        x = self.proj(x)
        x = self.proj_drop(x)
        return x

    def extra_repr(self) -> str:
        return f'dim={self.dim}, window_size={self.window_size}, num_heads={self.num_heads}'

    def flops(self, N):
        # operation count (computational cost), not floating-point operations per second
        # calculate flops for 1 window with token length of N
        flops = 0
        # qkv = self.qkv(x)
        # N is the number of tokens in a window, dim is the feature length of each token
        # (N, dim) @ (dim, 3*dim) -> (N, 3*dim), cost N * dim * 3 * dim
        flops += N * self.dim * 3 * self.dim
        # attn = (q @ k.transpose(-2, -1))
        # split into heads first, cost num_heads * N * (dim // num_heads) * N
        # (num_heads, N, dim//num_heads) @ (num_heads, N, dim//num_heads).T -> (num_heads, N, N)
        flops += self.num_heads * N * (self.dim // self.num_heads) * N
        # x = (attn @ v), cost num_heads * N^2 * (dim // num_heads)
        # (num_heads, N, N) @ (num_heads, N, dim//num_heads) -> (num_heads, N, dim//num_heads)
        flops += self.num_heads * N * N * (self.dim // self.num_heads)
        # x = self.proj(x), cost N * dim * dim
        # (N, dim) -> (N, dim)
        flops += N * self.dim * self.dim
        return flops


class SwinTransformerBlock(nn.Module):
    r""" Swin Transformer Block.

    Args:
        dim (int): Number of input channels.
        input_resolution (tuple[int]): Input resolution.
        num_heads (int): Number of attention heads.
        window_size (int): Window size.
        shift_size (int): Shift size for SW-MSA.
        mlp_ratio (float): Ratio of mlp hidden dim to embedding dim.
        qkv_bias (bool, optional): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float | None, optional): Override default qk scale of head_dim ** -0.5 if set.
        drop (float, optional): Dropout rate. Default: 0.0
        attn_drop (float, optional): Attention dropout rate. Default: 0.0
        drop_path (float, optional): Stochastic depth rate. Default: 0.0
        act_layer (nn.Module, optional): Activation layer. Default: nn.GELU
        norm_layer (nn.Module, optional): Normalization layer. Default: nn.LayerNorm
        fused_window_process (bool, optional): If True, use one kernel to fused window shift & window partition for acceleration, similar for the reversed part. Default: False
    """

    def __init__(self, dim, input_resolution, num_heads, window_size=7, shift_size=0,
                 mlp_ratio=4., qkv_bias=True, qk_scale=None, drop=0., attn_drop=0., drop_path=0.,
                 act_layer=nn.GELU, norm_layer=nn.LayerNorm,
                 fused_window_process=False):
        super().__init__()
        self.dim = dim
        self.input_resolution = input_resolution
        self.num_heads = num_heads
        self.window_size = window_size
        self.shift_size = shift_size
        self.mlp_ratio = mlp_ratio
        # self-attention is computed inside local windows
        if min(self.input_resolution) <= self.window_size:
            # if window size is larger than input resolution, we don't partition windows
            self.shift_size = 0
            self.window_size = min(self.input_resolution)
        assert 0 <= self.shift_size < self.window_size, "shift_size must in 0-window_size"

        self.norm1 = norm_layer(dim)
        self.attn = WindowAttention(
            dim, window_size=to_2tuple(self.window_size), num_heads=num_heads,
            qkv_bias=qkv_bias, qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop)
        # DropPath (stochastic depth): a regularizer that, per sample in the batch,
        # randomly drops the residual branch so that only the identity shortcut remains
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()
        self.norm2 = norm_layer(dim)
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim, act_layer=act_layer, drop=drop)

        if self.shift_size > 0:
            # calculate attention mask for SW-MSA
            H, W = self.input_resolution
            img_mask = torch.zeros((1, H, W, 1))  # 1 H W 1
            h_slices = (slice(0, -self.window_size),
                        slice(-self.window_size, -self.shift_size),
                        slice(-self.shift_size, None))
            w_slices = (slice(0, -self.window_size),
                        slice(-self.window_size, -self.shift_size),
                        slice(-self.shift_size, None))
            # label every region with an index; each axis is split into three parts:
            # 0:-window_size, -window_size:-shift_size, -shift_size:
            cnt = 0
            for h in h_slices:
                for w in w_slices:
                    img_mask[:, h, w, :] = cnt
                    cnt += 1
            # mask windows, e.g. (64, 7, 7, 1) for a 56 x 56 map with window_size 7
            mask_windows = window_partition(img_mask, self.window_size)  # nW, window_size, window_size, 1
            # mask_windows: (nW, window_size * window_size), e.g. (64, 49)
            mask_windows = mask_windows.view(-1, self.window_size * self.window_size)
            attn_mask = mask_windows.unsqueeze(1) - mask_windows.unsqueeze(2)
            attn_mask = attn_mask.masked_fill(attn_mask != 0, float(-100.0)).masked_fill(attn_mask == 0, float(0.0))
        else:
            attn_mask = None
        # store the attention mask as a buffer (see the paper for details)
        self.register_buffer("attn_mask", attn_mask)
        self.fused_window_process = fused_window_process

    def forward(self, x):
        H, W = self.input_resolution
        B, L, C = x.shape
        assert L == H * W, "input feature has wrong size"

        shortcut = x
        # layer normalization
        x = self.norm1(x)
        x = x.view(B, H, W, C)

        # cyclic shift
        if self.shift_size > 0:
            shifted_x = torch.roll(x, shifts=(-self.shift_size, -self.shift_size), dims=(1, 2))
            # partition windows
            x_windows = window_partition(shifted_x, self.window_size)  # nW*B, window_size, window_size, C
        else:
            shifted_x = x
            # partition windows; nW is the number of windows
            x_windows = window_partition(shifted_x, self.window_size)  # nW*B, window_size, window_size, C
        # (number of windows, tokens per window, channels)
        x_windows = x_windows.view(-1, self.window_size * self.window_size, C)  # nW*B, window_size*window_size, C

        # W-MSA/SW-MSA; all windows are attended to in parallel
        attn_windows = self.attn(x_windows, mask=self.attn_mask)  # nW*B, window_size*window_size, C

        # merge windows
        attn_windows = attn_windows.view(-1, self.window_size, self.window_size, C)

        # reverse cyclic shift
        if self.shift_size > 0:
            # merge all windows back into one feature map
            shifted_x = window_reverse(attn_windows, self.window_size, H, W)  # B H' W' C
            # roll the whole map back (down and to the right) to undo the earlier shift
            x = torch.roll(shifted_x, shifts=(self.shift_size, self.shift_size), dims=(1, 2))
        else:
            shifted_x = window_reverse(attn_windows, self.window_size, H, W)  # B H' W' C
            x = shifted_x
        x = x.view(B, H * W, C)
        x = shortcut + self.drop_path(x)

        # FFN
        x = x + self.drop_path(self.mlp(self.norm2(x)))

        return x


class PatchMerging(nn.Module):
    r""" Patch Merging Layer.

    Args:
        input_resolution (tuple[int]): Resolution of input feature.
        dim (int): Number of input channels.
        norm_layer (nn.Module, optional): Normalization layer. Default: nn.LayerNorm
    """

    def __init__(self, input_resolution, dim, norm_layer=nn.LayerNorm):
        super().__init__()
        # downsampling: sample every other row and column, which quadruples the channels,
        # then reduce the channels to 2x
        self.input_resolution = input_resolution
        self.dim = dim
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)
        self.norm = norm_layer(4 * dim)

    def forward(self, x):
        """
        x: B, H*W, C
        """
        H, W = self.input_resolution
        B, L, C = x.shape
        assert L == H * W, "input feature has wrong size"
        assert H % 2 == 0 and W % 2 == 0, f"x size ({H}*{W}) are not even."

        x = x.view(B, H, W, C)

        x0 = x[:, 0::2, 0::2, :]  # B H/2 W/2 C
        x1 = x[:, 1::2, 0::2, :]  # B H/2 W/2 C
        x2 = x[:, 0::2, 1::2, :]  # B H/2 W/2 C
        x3 = x[:, 1::2, 1::2, :]  # B H/2 W/2 C
        x = torch.cat([x0, x1, x2, x3], -1)  # B H/2 W/2 4*C
        x = x.view(B, -1, 4 * C)  # B H/2*W/2 4*C

        x = self.norm(x)
        x = self.reduction(x)

        return x


class BasicLayer(nn.Module):
    """ A basic Swin Transformer layer for one stage.

    Args:
        dim (int): Number of input channels.
        input_resolution (tuple[int]): Input resolution.
        depth (int): Number of blocks.
        num_heads (int): Number of attention heads.
        window_size (int): Local window size.
        mlp_ratio (float): Ratio of mlp hidden dim to embedding dim.
        qkv_bias (bool, optional): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float | None, optional): Override default qk scale of head_dim ** -0.5 if set.
        drop (float, optional): Dropout rate. Default: 0.0
        attn_drop (float, optional): Attention dropout rate. Default: 0.0
        drop_path (float | tuple[float], optional): Stochastic depth rate. Default: 0.0
        norm_layer (nn.Module, optional): Normalization layer. Default: nn.LayerNorm
        downsample (nn.Module | None, optional): Downsample layer at the end of the layer. Default: None
        use_checkpoint (bool): Whether to use checkpointing to save memory. Default: False.
        fused_window_process (bool, optional): If True, use one kernel to fused window shift & window partition for acceleration, similar for the reversed part. Default: False
    """

    def __init__(self, dim, input_resolution, depth, num_heads, window_size,
                 mlp_ratio=4., qkv_bias=True, qk_scale=None, drop=0., attn_drop=0.,
                 drop_path=0., norm_layer=nn.LayerNorm, downsample=None, use_checkpoint=False,
                 fused_window_process=False):

        super().__init__()
        self.dim = dim
        self.input_resolution = input_resolution
        self.depth = depth
        self.use_checkpoint = use_checkpoint

        # build blocks
        self.blocks = nn.ModuleList([
            SwinTransformerBlock(dim=dim, input_resolution=input_resolution,
                                 num_heads=num_heads, window_size=window_size,
                                 # even-indexed blocks use W-MSA (no shift), odd-indexed blocks
                                 # use SW-MSA, as in the official implementation
                                 shift_size=0 if (i % 2 == 0) else window_size // 2,
                                 mlp_ratio=mlp_ratio,
                                 qkv_bias=qkv_bias, qk_scale=qk_scale,
                                 drop=drop, attn_drop=attn_drop,
                                 drop_path=drop_path[i] if isinstance(drop_path, list) else drop_path,
                                 norm_layer=norm_layer,
                                 fused_window_process=fused_window_process)
            for i in range(depth)])

        # patch merging layer
        if downsample is not None:
            self.downsample = downsample(input_resolution, dim=dim, norm_layer=norm_layer)
        else:
            self.downsample = None

    def forward(self, x):
        for blk in self.blocks:
            if self.use_checkpoint:
                x = checkpoint.checkpoint(blk, x)
            else:
                x = blk(x)
        if self.downsample is not None:
            x = self.downsample(x)
        return x


class PatchEmbed(nn.Module):
    r""" Image to Patch Embedding

    Args:
        img_size (int): Image size. Default: 224.
        patch_size (int): Patch token size. Default: 4.
        in_chans (int): Number of input image channels. Default: 3.
        embed_dim (int): Number of linear projection output channels. Default: 96.
        norm_layer (nn.Module, optional): Normalization layer. Default: None
    """

    def __init__(self, img_size=224, patch_size=4, in_chans=3, embed_dim=96, norm_layer=None):
        super().__init__()
        # 224 -> (224, 224)
        img_size = to_2tuple(img_size)
        # 4 -> (4, 4)
        patch_size = to_2tuple(patch_size)
        # split into a grid of patches, [56, 56] with the defaults
        patches_resolution = [img_size[0] // patch_size[0], img_size[1] // patch_size[1]]
        self.img_size = img_size
        self.patch_size = patch_size
        self.patches_resolution = patches_resolution
        self.num_patches = patches_resolution[0] * patches_resolution[1]
        # number of input channels (3 by default)
        self.in_chans = in_chans
        # embedding dimension (96 by default)
        self.embed_dim = embed_dim
        # (B, 3, 224, 224) -> (B, 96, 224 // 4, 224 // 4): the strided convolution
        # splits the image into patches and changes the channel count in one step
        self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size, stride=patch_size)
        # optional layer normalization (the normalization commonly used in transformers)
        if norm_layer is not None:
            self.norm = norm_layer(embed_dim)
        else:
            self.norm = None

    def forward(self, x):
        B, C, H, W = x.shape
        # FIXME look at relaxing size constraints
        assert H == self.img_size[0] and W == self.img_size[1], \
            f"Input image size ({H}*{W}) doesn't match model ({self.img_size[0]}*{self.img_size[1]})."

        x = self.proj(x).flatten(2).transpose(1, 2)  # B Ph*Pw C
        if self.norm is not None:
            x = self.norm(x)
        return x


class SwinTransformer(nn.Module):
    r""" Swin Transformer
        A PyTorch impl of : `Swin Transformer: Hierarchical Vision Transformer using Shifted Windows`  -
          https://arxiv.org/pdf/2103.14030

    Args:
        img_size (int | tuple(int)): Input image size. Default 224
        patch_size (int | tuple(int)): Patch size. Default: 4
        in_chans (int): Number of input image channels. Default: 3
        num_classes (int): Number of classes for classification head. Default: 1000
        embed_dim (int): Patch embedding dimension. Default: 96
        depths (tuple(int)): Depth of each Swin Transformer layer.
        num_heads (tuple(int)): Number of attention heads in different layers.
        window_size (int): Window size. Default: 7
        mlp_ratio (float): Ratio of mlp hidden dim to embedding dim. Default: 4
        qkv_bias (bool): If True, add a learnable bias to query, key, value. Default: True
        qk_scale (float): Override default qk scale of head_dim ** -0.5 if set. Default: None
        drop_rate (float): Dropout rate. Default: 0
        attn_drop_rate (float): Attention dropout rate. Default: 0
        drop_path_rate (float): Stochastic depth rate. Default: 0.1
        norm_layer (nn.Module): Normalization layer. Default: nn.LayerNorm.
        ape (bool): If True, add absolute position embedding to the patch embedding. Default: False
        patch_norm (bool): If True, add normalization after patch embedding. Default: True
        use_checkpoint (bool): Whether to use checkpointing to save memory. Default: False
        fused_window_process (bool, optional): If True, use one kernel to fused window shift & window partition for acceleration, similar for the reversed part. Default: False
    """

    def __init__(self, img_size=224, patch_size=4, in_chans=3, num_classes=1000,
                 embed_dim=96, depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24],
                 window_size=7, mlp_ratio=4., qkv_bias=True, qk_scale=None,
                 drop_rate=0., attn_drop_rate=0., drop_path_rate=0.1,
                 norm_layer=nn.LayerNorm, ape=False, patch_norm=True,
                 use_checkpoint=False, fused_window_process=False, **kwargs):
        super().__init__()

        self.num_classes = num_classes
        self.num_layers = len(depths)  # depths holds the number of blocks per layer; blocks can be stacked freely
        self.embed_dim = embed_dim
        self.ape = ape  # absolute position embedding
        self.patch_norm = patch_norm  # normalization after patch embedding
        self.num_features = int(embed_dim * 2 ** (self.num_layers - 1))  # channel count after all layers
        self.mlp_ratio = mlp_ratio  # ratio of MLP hidden dim to embedding dim

        # split image into non-overlapping patches
        self.patch_embed = PatchEmbed(
            img_size=img_size, patch_size=patch_size, in_chans=in_chans, embed_dim=embed_dim,
            norm_layer=norm_layer if self.patch_norm else None)
        num_patches = self.patch_embed.num_patches
        patches_resolution = self.patch_embed.patches_resolution
        self.patches_resolution = patches_resolution

        # absolute position embedding
        if self.ape:
            self.absolute_pos_embed = nn.Parameter(torch.zeros(1, num_patches, embed_dim))
            trunc_normal_(self.absolute_pos_embed, std=.02)

        self.pos_drop = nn.Dropout(p=drop_rate)

        # stochastic depth
        # there are sum(depths) blocks in total (12 for [2, 2, 6, 2]);
        # the drop-path rate grows linearly from 0 to drop_path_rate across them
        dpr = [x.item() for x in torch.linspace(0, drop_path_rate, sum(depths))]  # stochastic depth decay rule

        # build layers
        self.layers = nn.ModuleList()
        # every BasicLayer except the last halves height and width and doubles the channels: 96 -> 96*2 -> 96*4 -> 96*8
        # with the defaults, the tokens entering the first layer have shape (B, 3136, 96)
        for i_layer in range(self.num_layers):
            layer = BasicLayer(dim=int(embed_dim * 2 ** i_layer),
                               input_resolution=(patches_resolution[0] // (2 ** i_layer),
                                                 patches_resolution[1] // (2 ** i_layer)),
                               depth=depths[i_layer],  # [2, 2, 6, 2]
                               num_heads=num_heads[i_layer],
                               window_size=window_size,  # 7 by default
                               mlp_ratio=self.mlp_ratio,
                               qkv_bias=qkv_bias, qk_scale=qk_scale,
                               drop=drop_rate, attn_drop=attn_drop_rate,
                               drop_path=dpr[sum(depths[:i_layer]):sum(depths[:i_layer + 1])],
                               norm_layer=norm_layer,
                               downsample=PatchMerging if (i_layer < self.num_layers - 1) else None,
                               use_checkpoint=use_checkpoint,
                               fused_window_process=fused_window_process)
            self.layers.append(layer)

        self.norm = norm_layer(self.num_features)
        self.avgpool = nn.AdaptiveAvgPool1d(1)
        self.head = nn.Linear(self.num_features, num_classes) if num_classes > 0 else nn.Identity()

        self.apply(self._init_weights)

    def _init_weights(self, m):
        if isinstance(m, nn.Linear):
            trunc_normal_(m.weight, std=.02)
            if isinstance(m, nn.Linear) and m.bias is not None:
                nn.init.constant_(m.bias, 0)
        elif isinstance(m, nn.LayerNorm):
            nn.init.constant_(m.bias, 0)
            nn.init.constant_(m.weight, 1.0)

    def forward_features(self, x):
        x = self.patch_embed(x)
        if self.ape:
            x = x + self.absolute_pos_embed
        x = self.pos_drop(x)

        for layer in self.layers:
            x = layer(x)

        x = self.norm(x)  # B L C
        x = self.avgpool(x.transpose(1, 2))  # B C 1
        x = torch.flatten(x, 1)
        return x

    def forward(self, x):
        x = self.forward_features(x)
        x = self.head(x)
        return x
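
To make the relative position bias lookup in WindowAttention easier to follow, here is a small pure-Python sketch of the same index arithmetic, without torch. The function name relative_position_index_table is hypothetical; it is meant to mirror the tensor computation in __init__ (coordinate differences, shifted to start from 0, then flattened into one index), not to replace it.

```python
def relative_position_index_table(wh, ww):
    """Return a (wh*ww) x (wh*ww) table of indices into the relative position bias table."""
    coords = [(h, w) for h in range(wh) for w in range(ww)]  # flattened token coordinates
    table = []
    for (hi, wi) in coords:
        row = []
        for (hj, wj) in coords:
            dh = hi - hj + (wh - 1)   # row offset, shifted to start from 0
            dw = wi - wj + (ww - 1)   # column offset, shifted to start from 0
            row.append(dh * (2 * ww - 1) + dw)  # flatten the 2-D offset into one index
        table.append(row)
    return table


idx = relative_position_index_table(2, 2)
for row in idx:
    print(row)
# [4, 3, 1, 0]
# [5, 4, 2, 1]
# [7, 6, 4, 3]
# [8, 7, 5, 4]
```

For a 2 x 2 window the indices cover 0..8, i.e. the (2*Wh-1) * (2*Ww-1) = 9 rows of relative_position_bias_table, and every pair of tokens with the same relative offset shares one learned bias per head.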

Related code from the official PyTorch training references

Image preprocessing

import torch
from torchvision.transforms import autoaugment, transforms
from torchvision.transforms.functional import InterpolationMode


class ClassificationPresetTrain:
    def __init__(
        self,
        *,
        crop_size,
        mean=(0.485, 0.456, 0.406),
        std=(0.229, 0.224, 0.225),
        interpolation=InterpolationMode.BILINEAR,
        hflip_prob=0.5,
        auto_augment_policy=None,
        ra_magnitude=9,
        augmix_severity=3,
        random_erase_prob=0.0,
    ):
        trans = [transforms.RandomResizedCrop(crop_size, interpolation=interpolation)]
        if hflip_prob > 0:
            trans.append(transforms.RandomHorizontalFlip(hflip_prob))
        if auto_augment_policy is not None:
            if auto_augment_policy == "ra":
                trans.append(autoaugment.RandAugment(interpolation=interpolation, magnitude=ra_magnitude))
            elif auto_augment_policy == "ta_wide":
                trans.append(autoaugment.TrivialAugmentWide(interpolation=interpolation))
            elif auto_augment_policy == "augmix":
                trans.append(autoaugment.AugMix(interpolation=interpolation, severity=augmix_severity))
            else:
                aa_policy = autoaugment.AutoAugmentPolicy(auto_augment_policy)
                trans.append(autoaugment.AutoAugment(policy=aa_policy, interpolation=interpolation))
        trans.extend(
            [
                transforms.PILToTensor(),
                transforms.ConvertImageDtype(torch.float),
                transforms.Normalize(mean=mean, std=std),
            ]
        )
        if random_erase_prob > 0:
            trans.append(transforms.RandomErasing(p=random_erase_prob))

        self.transforms = transforms.Compose(trans)

    def __call__(self, img):
        return self.transforms(img)


class ClassificationPresetEval:
    def __init__(
        self,
        *,
        crop_size,
        resize_size=256,
        mean=(0.485, 0.456, 0.406),
        std=(0.229, 0.224, 0.225),
        interpolation=InterpolationMode.BILINEAR,
    ):

        self.transforms = transforms.Compose(
            [
                transforms.Resize(resize_size, interpolation=interpolation),
                transforms.CenterCrop(crop_size),
                transforms.PILToTensor(),
                transforms.ConvertImageDtype(torch.float),
                transforms.Normalize(mean=mean, std=std),
            ]
        )

    def __call__(self, img):
        return self.transforms(img)

Distributed data sampling

import math

import torch
import torch.distributed as dist


class RASampler(torch.utils.data.Sampler):
    """Sampler that restricts data loading to a subset of the dataset for distributed,
    with repeated augmentation.
    It ensures that each augmented version of a sample will be visible to a
    different process (GPU).
    Heavily based on 'torch.utils.data.DistributedSampler'.

    This is borrowed from the DeiT Repo:
    https://github.com/facebookresearch/deit/blob/main/samplers.py
    """

    def __init__(self, dataset, num_replicas=None, rank=None, shuffle=True, seed=0, repetitions=3):
        if num_replicas is None:
            if not dist.is_available():
                raise RuntimeError("Requires distributed package to be available!")
            num_replicas = dist.get_world_size()
        if rank is None:
            if not dist.is_available():
                raise RuntimeError("Requires distributed package to be available!")
            rank = dist.get_rank()
        self.dataset = dataset
        self.num_replicas = num_replicas
        self.rank = rank
        self.epoch = 0
        self.num_samples = int(math.ceil(len(self.dataset) * float(repetitions) / self.num_replicas))
        self.total_size = self.num_samples * self.num_replicas
        self.num_selected_samples = int(math.floor(len(self.dataset) // 256 * 256 / self.num_replicas))
        self.shuffle = shuffle
        self.seed = seed
        self.repetitions = repetitions

    def __iter__(self):
        if self.shuffle:
            # Deterministically shuffle based on epoch
            g = torch.Generator()
            g.manual_seed(self.seed + self.epoch)
            indices = torch.randperm(len(self.dataset), generator=g).tolist()
        else:
            indices = list(range(len(self.dataset)))

        # Repeat every index `repetitions` times, then add extra samples to make it evenly divisible
        indices = [ele for ele in indices for i in range(self.repetitions)]
        indices += indices[: (self.total_size - len(indices))]
        assert len(indices) == self.total_size

        # Subsample
        indices = indices[self.rank : self.total_size : self.num_replicas]
        assert len(indices) == self.num_samples

        return iter(indices[: self.num_selected_samples])

    def __len__(self):
        return self.num_selected_samples

    def set_epoch(self, epoch):
        self.epoch = epoch
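
The index arithmetic of the sampler is easier to see on a tiny dataset. The following pure-Python sketch repeats the steps of __iter__ for the unshuffled case only. The function name ra_indices is hypothetical, and so is the round_to parameter, which replaces the hard-coded 256 so that a four-element example is not truncated to nothing.

```python
import math


def ra_indices(dataset_len, num_replicas, rank, repetitions=3, round_to=256):
    """Return (all indices assigned to this rank, the prefix that is actually yielded)."""
    indices = list(range(dataset_len))                           # shuffle=False case
    indices = [i for i in indices for _ in range(repetitions)]   # repeat every sample
    num_samples = int(math.ceil(dataset_len * float(repetitions) / num_replicas))
    total_size = num_samples * num_replicas
    indices += indices[: total_size - len(indices)]              # pad to an even split
    indices = indices[rank:total_size:num_replicas]              # interleave across ranks
    num_selected = int(math.floor(dataset_len // round_to * round_to / num_replicas))
    return indices, indices[:num_selected]


for rank in range(2):
    assigned, yielded = ra_indices(4, num_replicas=2, rank=rank, round_to=1)
    print(rank, assigned, yielded)
# 0 [0, 0, 1, 2, 2, 3] [0, 0]
# 1 [0, 1, 1, 2, 3, 3] [0, 1]
```

Because the repeated copies of a sample sit next to each other before the strided split, neighbouring copies land on different ranks, and each rank yields only about len(dataset) / num_replicas indices per epoch.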

Data augmentation: CutMix and MixUp
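
As a rough illustration of what MixUp does, here is a pure-Python sketch that blends two samples and their one-hot labels with a weight drawn from Beta(alpha, alpha). The function name mixup_pair and the use of plain lists are assumptions made for this example; the batch-level tensor implementation follows in the reference code.

```python
import random


def mixup_pair(x_a, x_b, y_a, y_b, num_classes, alpha=1.0, rng=random):
    """Blend two flattened samples and their integer labels; returns (x, y, lam)."""
    lam = rng.betavariate(alpha, alpha)                  # mixing weight in [0, 1]

    def one_hot(label):
        return [1.0 if k == label else 0.0 for k in range(num_classes)]

    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(one_hot(y_a), one_hot(y_b))]
    return x, y, lam


rng = random.Random(0)                                   # seeded for repeatability
x, y, lam = mixup_pair([0.0, 1.0], [1.0, 0.0], y_a=3, y_b=7, num_classes=10, rng=rng)
print(lam, x, y)
```

The mixed target is a soft label whose entries sum to 1, which is why a soft-target loss such as SoftTargetCrossEntropy is needed once MixUp is enabled.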

import math
from typing import Tuple

import torch
from torch import Tensor
from torchvision.transforms import functional as F


class RandomMixup(torch.nn.Module):
    """Randomly apply Mixup to the provided batch and targets.
    The class implements the data augmentations as described in the paper
    `"mixup: Beyond Empirical Risk Minimization" <https://arxiv.org/abs/1710.09412>`_.

    Args:
        num_classes (int): number of classes used for one-hot encoding.
        p (float): probability of the batch being transformed. Default value is 0.5.
        alpha (float): hyperparameter of the Beta distribution used for mixup.
            Default value is 1.0.
        inplace (bool): boolean to make this transform inplace. Default set to False.
    """

    def __init__(self, num_classes: int, p: float = 0.5, alpha: float = 1.0, inplace: bool = False) -> None:
        super().__init__()

        if num_classes < 1:
            raise ValueError(
                f"Please provide a valid positive value for the num_classes. Got num_classes={num_classes}"
            )

        if alpha <= 0:
            raise ValueError("Alpha param must be positive.")

        self.num_classes = num_classes
        self.p = p
        self.alpha = alpha
        self.inplace = inplace

    def forward(self, batch: Tensor, target: Tensor) -> Tuple[Tensor, Tensor]:
        """
        Args:
            batch (Tensor): Float tensor of size (B, C, H, W)
            target (Tensor): Integer tensor of size (B, )

        Returns:
            Tensor: Randomly transformed batch.
        """
        if batch.ndim != 4:
            raise ValueError(f"Batch ndim should be 4. Got {batch.ndim}")
        if target.ndim != 1:
            raise ValueError(f"Target ndim should be 1. Got {target.ndim}")
        if not batch.is_floating_point():
            raise TypeError(f"Batch dtype should be a float tensor. Got {batch.dtype}.")
        if target.dtype != torch.int64:
            raise TypeError(f"Target dtype should be torch.int64. Got {target.dtype}")

        if not self.inplace:
            batch = batch.clone()
            target = target.clone()

        if target.ndim == 1:
            target = torch.nn.functional.one_hot(target, num_classes=self.num_classes).to(dtype=batch.dtype)

        if torch.rand(1).item() >= self.p:
            return batch, target

        # It's faster to roll the batch by one instead of shuffling it to create image pairs
        batch_rolled = batch.roll(1, 0)
        target_rolled = target.roll(1, 0)

        # Implemented as on mixup paper, page 3.
        lambda_param = float(torch._sample_dirichlet(torch.tensor([self.alpha, self.alpha]))[0])
        batch_rolled.mul_(1.0 - lambda_param)
        batch.mul_(lambda_param).add_(batch_rolled)

        target_rolled.mul_(1.0 - lambda_param)
        target.mul_(lambda_param).add_(target_rolled)

        return batch, target

    def __repr__(self) -> str:
        s = (
            f"{self.__class__.__name__}("
            f"num_classes={self.num_classes}"
            f", p={self.p}"
            f", alpha={self.alpha}"
            f", inplace={self.inplace}"
            f")"
        )
        return s


class RandomCutmix(torch.nn.Module):
    """Randomly apply Cutmix to the provided batch and targets.
    The class implements the data augmentations as described in the paper
    `"CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features"
    <https://arxiv.org/abs/1905.04899>`_.

    Args:
        num_classes (int): number of classes used for one-hot encoding.
        p (float): probability of the batch being transformed. Default value is 0.5.
        alpha (float): hyperparameter of the Beta distribution used for cutmix.
            Default value is 1.0.
        inplace (bool): boolean to make this transform inplace. Default set to False.
    """

    def __init__(self, num_classes: int, p: float = 0.5, alpha: float = 1.0, inplace: bool = False) -> None:
        super().__init__()
        if num_classes < 1:
            raise ValueError("Please provide a valid positive value for the num_classes.")
        if alpha <= 0:
            raise ValueError("Alpha param must be positive.")

        self.num_classes = num_classes
        self.p = p
        self.alpha = alpha
        self.inplace = inplace

    def forward(self, batch: Tensor, target: Tensor) -> Tuple[Tensor, Tensor]:
        """
        Args:
            batch (Tensor): Float tensor of size (B, C, H, W)
            target (Tensor): Integer tensor of size (B, )

        Returns:
            Tensor: Randomly transformed batch.
        """
        if batch.ndim != 4:
            raise ValueError(f"Batch ndim should be 4. Got {batch.ndim}")
        if target.ndim != 1:
            raise ValueError(f"Target ndim should be 1. Got {target.ndim}")
        if not batch.is_floating_point():
            raise TypeError(f"Batch dtype should be a float tensor. Got {batch.dtype}.")
        if target.dtype != torch.int64:
            raise TypeError(f"Target dtype should be torch.int64. Got {target.dtype}")

        if not self.inplace:
            batch = batch.clone()
            target = target.clone()

        if target.ndim == 1:
            target = torch.nn.functional.one_hot(target, num_classes=self.num_classes).to(dtype=batch.dtype)

        if torch.rand(1).item() >= self.p:
            return batch, target

        # It's faster to roll the batch by one instead of shuffling it to create image pairs
        batch_rolled = batch.roll(1, 0)
        target_rolled = target.roll(1, 0)

        # Implemented as on cutmix paper, page 12 (with minor corrections on typos).
        lambda_param = float(torch._sample_dirichlet(torch.tensor([self.alpha, self.alpha]))[0])
        _, H, W = F.get_dimensions(batch)

        # Random centre of the patch
        r_x = torch.randint(W, (1,))
        r_y = torch.randint(H, (1,))

        # Half-width and half-height, chosen so the patch covers about (1 - lambda) of the image
        r = 0.5 * math.sqrt(1.0 - lambda_param)
        r_w_half = int(r * W)
        r_h_half = int(r * H)

        x1 = int(torch.clamp(r_x - r_w_half, min=0))
        y1 = int(torch.clamp(r_y - r_h_half, min=0))
        x2 = int(torch.clamp(r_x + r_w_half, max=W))
        y2 = int(torch.clamp(r_y + r_h_half, max=H))

        batch[:, :, y1:y2, x1:x2] = batch_rolled[:, :, y1:y2, x1:x2]
        # Recompute lambda from the clipped patch so the labels match the pixels actually replaced
        lambda_param = float(1.0 - (x2 - x1) * (y2 - y1) / (W * H))

        target_rolled.mul_(1.0 - lambda_param)
        target.mul_(lambda_param).add_(target_rolled)

        return batch, target

    def __repr__(self) -> str:
        s = (
            f"{self.__class__.__name__}("
            f"num_classes={self.num_classes}"
            f", p={self.p}"
            f", alpha={self.alpha}"
            f", inplace={self.inplace}"
            f")"
        )
        return s
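
To make the CutMix geometry concrete, here is a minimal sketch of the patch computation and the label mixing in plain Python, following the same arithmetic as `RandomCutmix.forward` but with a fixed patch centre instead of a random one. The helper names `cutmix_box` and `mix_targets`, and the 28x28 image size, are assumptions chosen for illustration.

```python
import math


def cutmix_box(lam, r_x, r_y, W, H):
    """Return the patch (x1, y1, x2, y2) and the area-corrected lambda."""
    r = 0.5 * math.sqrt(1.0 - lam)
    r_w_half = int(r * W)
    r_h_half = int(r * H)
    # Clip the patch to the image borders.
    x1 = max(r_x - r_w_half, 0)
    y1 = max(r_y - r_h_half, 0)
    x2 = min(r_x + r_w_half, W)
    y2 = min(r_y + r_h_half, H)
    # Lambda is the fraction of the original image that survives.
    lam_adj = 1.0 - (x2 - x1) * (y2 - y1) / (W * H)
    return (x1, y1, x2, y2), lam_adj


def mix_targets(t_a, t_b, lam):
    """Blend two one-hot label vectors with weight lam on the first."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(t_a, t_b)]


# A patch centred in a 28x28 image keeps the sampled lambda unchanged.
box_center, lam_center = cutmix_box(0.75, r_x=14, r_y=14, W=28, H=28)
# A patch near the corner is clipped, so less of the image is replaced.
box_corner, lam_corner = cutmix_box(0.75, r_x=2, r_y=2, W=28, H=28)
soft_label = mix_targets([1.0, 0.0], [0.0, 1.0], lam_center)
print(box_center, lam_center, box_corner, lam_corner, soft_label)
```

The corner case shows why lambda is recomputed after clipping: the replaced area shrinks, so the label weight of the original image rises above the sampled value.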

Value smoothing, to reduce jitter in the loss values logged during training
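
The smoothing idea can be sketched with the standard library alone: keep a short window of recent values for a median, plus a running total for the global average. This is a simplified illustration with made-up loss values and a window of three; it is not the utility code itself.

```python
from collections import deque
from statistics import median

# Hypothetical per-iteration losses with two spikes.
losses = [4.0, 1.0, 1.2, 0.9, 5.0, 0.8]

window = deque(maxlen=3)  # only the most recent three values are kept
total, count = 0.0, 0
for loss in losses:
    window.append(loss)
    total += loss
    count += 1

window_median = median(window)  # robust to the spike inside the window
global_avg = total / count      # average over every value seen so far
print(window_median, global_avg)
```

The windowed median ignores the spike of 5.0 that is still inside the window, while the global average is pulled up by both spikes.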

import copy
import datetime
import errno
import hashlib
import os
import time
from collections import defaultdict, deque, OrderedDict
from typing import List, Optional, Tuple

import torch
import torch.distributed as dist


class SmoothedValue:
    """Track a series of values and provide access to smoothed values over a
    window or the global series average.
    """

    def __init__(self, window_size=20, fmt=None):
        if fmt is None:
            fmt = "{median:.4f} ({global_avg:.4f})"
        self.deque = deque(maxlen=window_size)
        self.total = 0.0
        self.count = 0
        self.fmt = fmt

    def update(self, value, n=1):
        self.deque.append(value)
        self.count += n
        self.total += value * n

    def synchronize_between_processes(self):
        """
        Warning: does not synchronize the deque!
        """
        t = reduce_across_processes([self.count, self.total])
        t = t.tolist()
        self.count = int(t[0])
        self.total = t[1]

    @property
    def median(self):
        d = torch.tensor(list(self.deque))
        return d.median().item()

    @property
    def avg(self):
        d = torch.tensor(list(self.deque), dtype=torch.float32)
        return d.mean().item()

    @property
    def global_avg(self):
        return self.total / self.count

    @property
    def max(self):
        return max(self.deque)

    @property
    def value(self):
        return self.deque[-1]

    def __str__(self):
        return self.fmt.format(
            median=self.median, avg=self.avg, global_avg=self.global_avg, max=self.max, value=self.value
        )


class MetricLogger:
    def __init__(self, delimiter="\t"):
        self.meters = defaultdict(SmoothedValue)
        self.delimiter = delimiter

    def update(self, **kwargs):
        for k, v in kwargs.items():
            if isinstance(v, torch.Tensor):
                v = v.item()
            assert isinstance(v, (float, int))
            self.meters[k].update(v)

    def __getattr__(self, attr):
        if attr in self.meters:
            return self.meters[attr]
        if attr in self.__dict__:
            return self.__dict__[attr]
        raise AttributeError(f"'{type(self).__name__}' object has no attribute '{attr}'")

    def __str__(self):
        loss_str = []
        for name, meter in self.meters.items():
            loss_str.append(f"{name}: {str(meter)}")
        return self.delimiter.join(loss_str)

    def synchronize_between_processes(self):
        for meter in self.meters.values():
            meter.synchronize_between_processes()

    def add_meter(self, name, meter):
        self.meters[name] = meter

    def log_every(self, iterable, print_freq, header=None):
        i = 0
        if not header:
            header = ""
        start_time = time.time()
        end = time.time()
        iter_time = SmoothedValue(fmt="{avg:.4f}")
        data_time = SmoothedValue(fmt="{avg:.4f}")
        space_fmt = ":" + str(len(str(len(iterable)))) + "d"
        if torch.cuda.is_available():
            log_msg = self.delimiter.join(
                [
                    header,
                    "[{0" + space_fmt + "}/{1}]",
                    "eta: {eta}",
                    "{meters}",
                    "time: {time}",
                    "data: {data}",
                    "max mem: {memory:.0f}",
                ]
            )
        else:
            log_msg = self.delimiter.join(
                [header, "[{0" + space_fmt + "}/{1}]", "eta: {eta}", "{meters}", "time: {time}", "data: {data}"]
            )
        MB = 1024.0 * 1024.0
        for obj in iterable:
            data_time.update(time.time() - end)
            yield obj
            iter_time.update(time.time() - end)
            if i % print_freq == 0:
                eta_seconds = iter_time.global_avg * (len(iterable) - i)
                eta_string = str(datetime.timedelta(seconds=int(eta_seconds)))
                if torch.cuda.is_available():
                    print(
                        log_msg.format(
                            i,
                            len(iterable),
                            eta=eta_string,
                            meters=str(self),
                            time=str(iter_time),
                            data=str(data_time),
                            memory=torch.cuda.max_memory_allocated() / MB,
                        )
                    )
                else:
                    print(
                        log_msg.format(
                            i, len(iterable), eta=eta_string, meters=str(self), time=str(iter_time), data=str(data_time)
                        )
                    )
            i += 1
            end = time.time()
        total_time = time.time() - start_time
        total_time_str = str(datetime.timedelta(seconds=int(total_time)))
        print(f"{header} Total time: {total_time_str}")


class ExponentialMovingAverage(torch.optim.swa_utils.AveragedModel):
    """Maintains moving averages of model parameters using an exponential decay.
    ``ema_avg = decay * avg_model_param + (1 - decay) * model_param``
    `torch.optim.swa_utils.AveragedModel <https://pytorch.org/docs/stable/optim.html#custom-averaging-strategies>`_
    is used to compute the EMA.
    """

    def __init__(self, model, decay, device="cpu"):
        def ema_avg(avg_model_param, model_param, num_averaged):
            return decay * avg_model_param + (1 - decay) * model_param

        super().__init__(model, device, ema_avg, use_buffers=True)


def accuracy(output, target, topk=(1,)):
    """Computes the accuracy over the k top predictions for the specified values of k"""
    with torch.inference_mode():
        maxk = max(topk)
        batch_size = target.size(0)
        if target.ndim == 2:
            # Soft (mixed) targets: use the class with the largest weight
            target = target.max(dim=1)[1]

        _, pred = output.topk(maxk, 1, True, True)
        pred = pred.t()
        correct = pred.eq(target[None])

        res = []
        for k in topk:
            correct_k = correct[:k].flatten().sum(dtype=torch.float32)
            res.append(correct_k * (100.0 / batch_size))
        return res


def mkdir(path):
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise


def setup_for_distributed(is_master):
    """
    This function disables printing when not in master process
    """
    import builtins as __builtin__

    builtin_print = __builtin__.print

    def print(*args, **kwargs):
        force = kwargs.pop("force", False)
        if is_master or force:
            builtin_print(*args, **kwargs)

    __builtin__.print = print


def is_dist_avail_and_initialized():
    if not dist.is_available():
        return False
    if not dist.is_initialized():
        return False
    return True


def get_world_size():
    if not is_dist_avail_and_initialized():
        return 1
    return dist.get_world_size()


def get_rank():
    if not is_dist_avail_and_initialized():
        return 0
    return dist.get_rank()


def is_main_process():
    return get_rank() == 0


def save_on_master(*args, **kwargs):
    if is_main_process():
        torch.save(*args, **kwargs)


def init_distributed_mode(args):
    if "RANK" in os.environ and "WORLD_SIZE" in os.environ:
        args.rank = int(os.environ["RANK"])
        args.world_size = int(os.environ["WORLD_SIZE"])
        args.gpu = int(os.environ["LOCAL_RANK"])
    elif "SLURM_PROCID" in os.environ:
        args.rank = int(os.environ["SLURM_PROCID"])
        args.gpu = args.rank % torch.cuda.device_count()
    elif hasattr(args, "rank"):
        pass
    else:
        print("Not using distributed mode")
        args.distributed = False
        return

    args.distributed = True

    torch.cuda.set_device(args.gpu)
    args.dist_backend = "nccl"
    print(f"| distributed init (rank {args.rank}): {args.dist_url}", flush=True)
    torch.distributed.init_process_group(
        backend=args.dist_backend, init_method=args.dist_url, world_size=args.world_size, rank=args.rank
    )
    torch.distributed.barrier()
    setup_for_distributed(args.rank == 0)


def average_checkpoints(inputs):
    """Loads checkpoints from inputs and returns a model with averaged weights. Original implementation taken from:
    https://github.com/pytorch/fairseq/blob/a48f235636557b8d3bc4922a6fa90f3a0fa57955/scripts/average_checkpoints.py#L16

    Args:
      inputs (List[str]): An iterable of string paths of checkpoints to load from.
    Returns:
      A dict of string keys mapping to various values. The 'model' key
      from the returned dict should correspond to an OrderedDict mapping
      string parameter names to torch Tensors.
    """
    params_dict = OrderedDict()
    params_keys = None
    new_state = None
    num_models = len(inputs)
    for fpath in inputs:
        with open(fpath, "rb") as f:
            state = torch.load(
                f,
                map_location=(lambda s, _: torch.serialization.default_restore_location(s, "cpu")),
            )
        # Copies over the settings from the first checkpoint
        if new_state is None:
            new_state = state
        model_params = state["model"]
        model_params_keys = list(model_params.keys())
        if params_keys is None:
            params_keys = model_params_keys
        elif params_keys != model_params_keys:
            # Report the checkpoint path rather than the (already closed) file object
            raise KeyError(
                f"For checkpoint {fpath}, expected list of params: {params_keys}, but found: {model_params_keys}"
            )
        for k in params_keys:
            p = model_params[k]
            if isinstance(p, torch.HalfTensor):
                p = p.float()
            if k not in params_dict:
                params_dict[k] = p.clone()
                # NOTE: clone() is needed in case p is a shared parameter
            else:
                params_dict[k] += p
    averaged_params = OrderedDict()
    for k, v in params_dict.items():
        averaged_params[k] = v
        if averaged_params[k].is_floating_point():
            averaged_params[k].div_(num_models)
        else:
            averaged_params[k] //= num_models
    new_state["model"] = averaged_params
    return new_state


def store_model_weights(model, checkpoint_path, checkpoint_key="model", strict=True):
    """
    This method can be used to prepare weights files for new models. It receives as
    input a model architecture and a checkpoint from the training script and produces
    a file with the weights ready for release.

    Examples:
        from torchvision import models as M

        # Classification
        model = M.mobilenet_v3_large(weights=None)
        print(store_model_weights(model, './class.pth'))

        # Quantized Classification
        model = M.quantization.mobilenet_v3_large(weights=None, quantize=False)
        model.fuse_model(is_qat=True)
        model.qconfig = torch.ao.quantization.get_default_qat_qconfig('qnnpack')
        _ = torch.ao.quantization.prepare_qat(model, inplace=True)
        print(store_model_weights(model, './qat.pth'))

        # Object Detection
        model = M.detection.fasterrcnn_mobilenet_v3_large_fpn(weights=None, weights_backbone=None)
        print(store_model_weights(model, './obj.pth'))

        # Segmentation
        model = M.segmentation.deeplabv3_mobilenet_v3_large(weights=None, weights_backbone=None, aux_loss=True)
        print(store_model_weights(model, './segm.pth', strict=False))

    Args:
        model (torch.nn.Module): The model on which the weights will be loaded for validation purposes.
        checkpoint_path (str): The path of the checkpoint we will load.
        checkpoint_key (str, optional): The key of the checkpoint where the model weights are stored.
            Default: "model".
        strict (bool): whether to strictly enforce that the keys
            in :attr:`state_dict` match the keys returned by this module's
            :meth:`~torch.nn.Module.state_dict` function. Default: ``True``

    Returns:
        output_path (str): The location where the weights are saved.
    """
    # Store the new model next to the checkpoint_path
    checkpoint_path = os.path.abspath(checkpoint_path)
    output_dir = os.path.dirname(checkpoint_path)

    # Deep copy to avoid side-effects on the model object.
    model = copy.deepcopy(model)
    checkpoint = torch.load(checkpoint_path, map_location="cpu")

    # Load the weights to the model to validate that everything works
    # and remove unnecessary weights (such as auxiliaries, etc)
    if checkpoint_key == "model_ema":
        del checkpoint[checkpoint_key]["n_averaged"]
        torch.nn.modules.utils.consume_prefix_in_state_dict_if_present(checkpoint[checkpoint_key], "module.")
    model.load_state_dict(checkpoint[checkpoint_key], strict=strict)

    tmp_path = os.path.join(output_dir, str(model.__hash__()))
    torch.save(model.state_dict(), tmp_path)

    sha256_hash = hashlib.sha256()
    with open(tmp_path, "rb") as f:
        # Read and update hash string value in blocks of 4K
        for byte_block in iter(lambda: f.read(4096), b""):
            sha256_hash.update(byte_block)
        hh = sha256_hash.hexdigest()

    output_path = os.path.join(output_dir, "weights-" + str(hh[:8]) + ".pth")
    os.replace(tmp_path, output_path)

    return output_path


def reduce_across_processes(val):
    if not is_dist_avail_and_initialized():
        # nothing to sync, but we still convert to tensor for consistency with the distributed case.
        return torch.tensor(val)

    t = torch.tensor(val, device="cuda")
    dist.barrier()
    dist.all_reduce(t)
    return t


def set_weight_decay(
    model: torch.nn.Module,
    weight_decay: float,
    norm_weight_decay: Optional[float] = None,
    norm_classes: Optional[List[type]] = None,
    custom_keys_weight_decay: Optional[List[Tuple[str, float]]] = None,
):
    if not norm_classes:
        norm_classes = [
            torch.nn.modules.batchnorm._BatchNorm,
            torch.nn.LayerNorm,
            torch.nn.GroupNorm,
            torch.nn.modules.instancenorm._InstanceNorm,
            torch.nn.LocalResponseNorm,
        ]
    norm_classes = tuple(norm_classes)

    params = {
        "other": [],
        "norm": [],
    }
    params_weight_decay = {
        "other": weight_decay,
        "norm": norm_weight_decay,
    }
    custom_keys = []
    if custom_keys_weight_decay is not None:
        for key, weight_decay in custom_keys_weight_decay:
            params[key] = []
            params_weight_decay[key] = weight_decay
            custom_keys.append(key)

    def _add_params(module, prefix=""):
        for name, p in module.named_parameters(recurse=False):
            if not p.requires_grad:
                continue
            is_custom_key = False
            for key in custom_keys:
                target_name = f"{prefix}.{name}" if prefix != "" and "." in key else name
                if key == target_name:
                    params[key].append(p)
                    is_custom_key = True
                    break
            if not is_custom_key:
                if norm_weight_decay is not None and isinstance(module, norm_classes):
                    params["norm"].append(p)
                else:
                    params["other"].append(p)

        for child_name, child_module in module.named_children():
            child_prefix = f"{prefix}.{child_name}" if prefix != "" else child_name
            _add_params(child_module, prefix=child_prefix)

    _add_params(model)

    param_groups = []
    for key in params:
        if len(params[key]) > 0:
            param_groups.append({"params": params[key], "weight_decay": params_weight_decay[key]})
    return param_groups
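
The update rule stated in the `ExponentialMovingAverage` docstring above can be traced by hand on a single scalar. The sketch below is a plain-Python illustration with made-up parameter values and a deliberately small decay of 0.5 so the numbers are easy to follow; `ema_update` is a hypothetical helper, not part of the utilities.

```python
def ema_update(avg, new, decay):
    """One EMA step: decay * avg + (1 - decay) * new."""
    return decay * avg + (1.0 - decay) * new


# Hypothetical values of one parameter over four successive updates.
weights = [0.0, 1.0, 1.0, 1.0]
decay = 0.5

# Assumption: as in AveragedModel, the very first update copies the value.
ema = weights[0]
history = [ema]
for w in weights[1:]:
    ema = ema_update(ema, w, decay)
    history.append(ema)
print(history)
```

The averaged value approaches the new weight gradually instead of jumping to it. In the training loop of this post the averaging counter is reset during the warmup epochs, so the EMA model keeps copying the live weights until warmup ends.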

Training and evaluation functions (this classification run does not use the official main function)
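
Both the training and the evaluation function report top-1 and top-5 accuracy. As a small self-contained illustration of what top-k accuracy means, the sketch below computes it on plain Python lists; `topk_accuracy` and the toy scores are assumptions for this example and stand in for the tensor-based helper.

```python
def topk_accuracy(scores, targets, k):
    """Percentage of rows whose target is among the k highest-scoring classes."""
    correct = 0
    for row, target in zip(scores, targets):
        # Class indices sorted by descending score, truncated to the best k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        correct += target in topk
    return 100.0 * correct / len(targets)


# Toy scores for four samples and three classes.
scores = [
    [0.1, 0.7, 0.2],  # best class: 1
    [0.5, 0.3, 0.2],  # best class: 0
    [0.2, 0.3, 0.5],  # best class: 2
    [0.6, 0.1, 0.3],  # best class: 0
]
targets = [1, 1, 2, 2]

top1 = topk_accuracy(scores, targets, k=1)
top2 = topk_accuracy(scores, targets, k=2)
print(top1, top2)
```

Since MNIST has only ten classes, top-5 accuracy is a loose metric there and saturates quickly; top-1 is the figure to watch.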

import datetime
import os
import time
import warnings

import torch
import torch.utils.data
import torchvision
from torch import nn
from torch.utils.data.dataloader import default_collate
from torchvision.transforms.functional import InterpolationMode


def train_one_epoch(
    model,
    criterion,
    optimizer,
    data_loader,
    device,
    epoch,
    print_freq=10,
    model_ema=None,
    scaler=None,
    clip_grad_norm=None,
    lr_warmup_epochs=10,
    model_ema_steps=5,
):
    model.train()
    metric_logger = MetricLogger(delimiter=" ")
    metric_logger.add_meter("lr", SmoothedValue(window_size=1, fmt="{value}"))
    metric_logger.add_meter("img/s", SmoothedValue(window_size=10, fmt="{value}"))

    header = f"Epoch: [{epoch}]"
    for i, (image, target) in enumerate(metric_logger.log_every(data_loader, print_freq, header)):
        start_time = time.time()
        image, target = image.to(device), target.to(device)
        # Mixed precision is enabled only when a GradScaler is passed in
        with torch.cuda.amp.autocast(enabled=scaler is not None):
            output = model(image)
            loss = criterion(output, target)

        optimizer.zero_grad()
        if scaler is not None:
            scaler.scale(loss).backward()
            if clip_grad_norm is not None:
                # Unscale the gradients of the optimizer's params before clipping them
                scaler.unscale_(optimizer)
                nn.utils.clip_grad_norm_(model.parameters(), clip_grad_norm)
            scaler.step(optimizer)
            scaler.update()
        else:
            loss.backward()
            if clip_grad_norm is not None:
                nn.utils.clip_grad_norm_(model.parameters(), clip_grad_norm)
            optimizer.step()

        if model_ema and i % model_ema_steps == 0:
            model_ema.update_parameters(model)
            if epoch < lr_warmup_epochs:
                # Reset ema buffer to keep copying weights during warmup period
                model_ema.n_averaged.fill_(0)

        acc1, acc5 = accuracy(output, target, topk=(1, 5))
        batch_size = image.shape[0]
        metric_logger.update(loss=loss.item(), lr=optimizer.param_groups[0]["lr"])
        metric_logger.meters["acc1"].update(acc1.item(), n=batch_size)
        metric_logger.meters["acc5"].update(acc5.item(), n=batch_size)
        metric_logger.meters["img/s"].update(batch_size / (time.time() - start_time))


def evaluate(model, criterion, data_loader, device, print_freq=100, log_suffix=""):
    model.eval()
    metric_logger = MetricLogger(delimiter=" ")
    header = f"Test: {log_suffix}"

    num_processed_samples = 0
    with torch.inference_mode():
        for image, target in metric_logger.log_every(data_loader, print_freq, header):
            image = image.to(device, non_blocking=True)
            target = target.to(device, non_blocking=True)
            output = model(image)
            loss = criterion(output, target)

            acc1, acc5 = accuracy(output, target, topk=(1, 5))
            # FIXME need to take into account that the datasets
            # could have been padded in distributed setup
            batch_size = image.shape[0]
            metric_logger.update(loss=loss.item())
            metric_logger.meters["acc1"].update(acc1.item(), n=batch_size)
            metric_logger.meters["acc5"].update(acc5.item(), n=batch_size)
            num_processed_samples += batch_size
    # gather the stats from all processes

    num_processed_samples = reduce_across_processes(num_processed_samples)
    if (
        hasattr(data_loader.dataset, "__len__")
        and len(data_loader.dataset) != num_processed_samples
        # is_main_process() is also safe when torch.distributed is not initialized
        and is_main_process()
    ):
        # See FIXME above
        warnings.warn(
            f"It looks like the dataset has {len(data_loader.dataset)} samples, but {num_processed_samples} "
            "samples were used for the validation, which might bias the results. "
            "Try adjusting the batch size and / or the world size. "
            "Setting the world size to 1 is always a safe bet."
        )

    metric_logger.synchronize_between_processes()

    print(f"{header} Acc@1 {metric_logger.acc1.global_avg:.3f} Acc@5 {metric_logger.acc5.global_avg:.3f}")
    return metric_logger.acc1.global_avg


def _get_cache_path(filepath):
    import hashlib

    h = hashlib.sha1(filepath.encode()).hexdigest()
    cache_path = os.path.join("~", ".torch", "vision", "datasets", "imagefolder", h[:10] + ".pt")
    cache_path = os.path.expanduser(cache_path)
    return cache_path


def load_data(traindir, valdir, args):
    # Data loading code
    print("Loading data")
    val_resize_size, val_crop_size, train_crop_size = (
        args.val_resize_size,
        args.val_crop_size,
        args.train_crop_size,
    )
    interpolation = InterpolationMode(args.interpolation)

    print("Loading training data")
    st = time.time()
    cache_path = _get_cache_path(traindir)
    if args.cache_dataset and os.path.exists(cache_path):
        # Attention, as the transforms are also cached!
        print(f"Loading dataset_train from {cache_path}")
        dataset, _ = torch.load(cache_path)
    else:
        auto_augment_policy = getattr(args, "auto_augment", None)
        random_erase_prob = getattr(args, "random_erase", 0.0)
        ra_magnitude = args.ra_magnitude
        augmix_severity = args.augmix_severity
        dataset = torchvision.datasets.ImageFolder(
            traindir,
            ClassificationPresetTrain(
                crop_size=train_crop_size,
                interpolation=interpolation,
                auto_augment_policy=auto_augment_policy,
                random_erase_prob=random_erase_prob,
                ra_magnitude=ra_magnitude,
                augmix_severity=augmix_severity,
            ),
        )
        if args.cache_dataset:
            print(f"Saving dataset_train to {cache_path}")
            mkdir(os.path.dirname(cache_path))
            save_on_master((dataset, traindir), cache_path)
    print("Took", time.time() - st)

    print("Loading validation data")
    cache_path = _get_cache_path(valdir)
    if args.cache_dataset and os.path.exists(cache_path):
        # Attention, as the transforms are also cached!
        print(f"Loading dataset_test from {cache_path}")
        dataset_test, _ = torch.load(cache_path)
    else:
        if args.weights and args.test_only:
            weights = torchvision.models.get_weight(args.weights)
            preprocessing = weights.transforms()
        else:
            preprocessing = ClassificationPresetEval(
                crop_size=val_crop_size, resize_size=val_resize_size, interpolation=interpolation
            )

        dataset_test = torchvision.datasets.ImageFolder(
            valdir,
            preprocessing,
        )
        if args.cache_dataset:
            print(f"Saving dataset_test to {cache_path}")
            mkdir(os.path.dirname(cache_path))
            save_on_master((dataset_test, valdir), cache_path)

    print("Creating data loaders")
    if args.distributed:
        if hasattr(args, "ra_sampler") and args.ra_sampler:
            train_sampler = RASampler(dataset, shuffle=True, repetitions=args.ra_reps)
        else:
            train_sampler = torch.utils.data.distributed.DistributedSampler(dataset)
        test_sampler = torch.utils.data.distributed.DistributedSampler(dataset_test, shuffle=False)
    else:
        train_sampler = torch.utils.data.RandomSampler(dataset)
        test_sampler = torch.utils.data.SequentialSampler(dataset_test)

    return dataset, dataset_test, train_sampler, test_sampler


def main(args):
    if args.output_dir:
        mkdir(args.output_dir)

    init_distributed_mode(args)
    print(args)

    device = torch.device(args.device)

    if args.use_deterministic_algorithms:
        torch.backends.cudnn.benchmark = False
        torch.use_deterministic_algorithms(True)
    else:
        torch.backends.cudnn.benchmark = True

    train_dir = os.path.join(args.data_path, "train")
    val_dir = os.path.join(args.data_path, "val")
    dataset, dataset_test, train_sampler, test_sampler = load_data(train_dir, val_dir, args)

    collate_fn = None
    num_classes = len(dataset.classes)
    mixup_transforms = []
    if args.mixup_alpha > 0.0:
        mixup_transforms.append(RandomMixup(num_classes, p=1.0, alpha=args.mixup_alpha))
    if args.cutmix_alpha > 0.0:
        mixup_transforms.append(RandomCutmix(num_classes, p=1.0, alpha=args.cutmix_alpha))
    if mixup_transforms:
        # Pick one of the enabled transforms at random for each batch
        mixupcutmix = torchvision.transforms.RandomChoice(mixup_transforms)

        def collate_fn(batch):
            return mixupcutmix(*default_collate(batch))

    data_loader = torch.utils.data.DataLoader(
        dataset,
        batch_size=args.batch_size,
        sampler=train_sampler,
        num_workers=args.workers,
        pin_memory=True,
        collate_fn=collate_fn,
    )
    data_loader_test = torch.utils.data.DataLoader(
        dataset_test, batch_size=args.batch_size, sampler=test_sampler, num_workers=args.workers, pin_memory=True
    )

    print("Creating model")
    model = torchvision.models.get_model(args.model, weights=args.weights, num_classes=num_classes)
    model.to(device)

    if args.distributed and args.sync_bn:
        model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)

    criterion = nn.CrossEntropyLoss(label_smoothing=args.label_smoothing)

    custom_keys_weight_decay = []
    if args.bias_weight_decay is not None:
        custom_keys_weight_decay.append(("bias", args.bias_weight_decay))
    if args.transformer_embedding_decay is not None:
        for key in ["class_token", "position_embedding", "relative_position_bias_table"]:
            custom_keys_weight_decay.append((key, args.transformer_embedding_decay))
    parameters = set_weight_decay(
        model,
        args.weight_decay,
        norm_weight_decay=args.norm_weight_decay,
        custom_keys_weight_decay=custom_keys_weight_decay if len(custom_keys_weight_decay) > 0 else None,
    )

    opt_name = args.opt.lower()
    if opt_name.startswith("sgd"):
        optimizer = torch.optim.SGD(
            parameters,
            lr=args.lr,
            momentum=args.momentum,
            weight_decay=args.weight_decay,
            nesterov="nesterov" in opt_name,
        )
    elif opt_name == "rmsprop":
        optimizer = torch.optim.RMSprop(
            parameters, lr=args.lr, momentum=args.momentum, weight_decay=args.weight_decay, eps=0.0316, alpha=0.9
        )
    elif opt_name == "adamw":
        optimizer = torch.optim.AdamW(parameters, lr=args.lr, weight_decay=args.weight_decay)
    else:
        raise RuntimeError(f"Invalid optimizer {args.opt}. Only SGD, RMSprop and AdamW are supported.")

    scaler = torch.cuda.amp.GradScaler() if args.amp else None

    args.lr_scheduler = args.lr_scheduler.lower()
    if args.lr_scheduler == "steplr":
        main_lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=args.lr_step_size, gamma=args.lr_gamma)
    elif args.lr_scheduler == "cosineannealinglr":
        main_lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
            optimizer, T_max=args.epochs - args.lr_warmup_epochs, eta_min=args.lr_min
        )
    elif args.lr_scheduler == "exponentiallr":
        main_lr_scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=args.lr_gamma)
    else:
        raise RuntimeError(
            f"Invalid lr scheduler '{args.lr_scheduler}'. Only StepLR, CosineAnnealingLR and ExponentialLR "
            "are supported."
        )

    if args.lr_warmup_epochs > 0:
        if args.lr_warmup_method == "linear":
            warmup_lr_scheduler = torch.optim.lr_scheduler.LinearLR(
                optimizer, start_factor=args.lr_warmup_decay, total_iters=args.lr_warmup_epochs
            )
        elif args.lr_warmup_method == "constant":
            warmup_lr_scheduler = torch.optim.lr_scheduler.ConstantLR(
                optimizer, factor=args.lr_warmup_decay, total_iters=args.lr_warmup_epochs
            )
        else:
            raise RuntimeError(
                f"Invalid warmup lr method '{args.lr_warmup_method}'. Only linear and constant are supported."
            )
        # Run the warmup scheduler first, then switch to the main scheduler
        lr_scheduler = torch.optim.lr_scheduler.SequentialLR(
            optimizer, schedulers=[warmup_lr_scheduler, main_lr_scheduler], milestones=[args.lr_warmup_epochs]
        )
    else:
        lr_scheduler = main_lr_scheduler

model_without_ddp = model
if args.distributed:
model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.gpu])
model_without_ddp = model.module

model_ema = None
if args.model_ema:
# Decay adjustment that aims to keep the decay independent from other hyper-parameters originally proposed at:
# https://github.com/facebookresearch/pycls/blob/f8cd9627/pycls/core/net.py#L123
#
# total_ema_updates = (Dataset_size / n_GPUs) * epochs / (batch_size_per_gpu * EMA_steps)
# We consider constant = Dataset_size for a given dataset/setup and ommit it. Thus:
# adjust = 1 / total_ema_updates ~= n_GPUs * batch_size_per_gpu * EMA_steps / epochs
adjust = args.world_size * args.batch_size * args.model_ema_steps / args.epochs
alpha = 1.0 - args.model_ema_decay
alpha = min(1.0, alpha * adjust)
model_ema = ExponentialMovingAverage(model_without_ddp, device=device, decay=1.0 - alpha)

if args.resume:
checkpoint = torch.load(args.resume, map_location="cpu")
model_without_ddp.load_state_dict(checkpoint["model"])
if not args.test_only:
optimizer.load_state_dict(checkpoint["optimizer"])
lr_scheduler.load_state_dict(checkpoint["lr_scheduler"])
args.start_epoch = checkpoint["epoch"] + 1
if model_ema:
model_ema.load_state_dict(checkpoint["model_ema"])
if scaler:
scaler.load_state_dict(checkpoint["scaler"])

if args.test_only:
# We disable the cudnn benchmarking because it can noticeably affect the accuracy
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
if model_ema:
evaluate(model_ema, criterion, data_loader_test, device=device, log_suffix="EMA")
else:
evaluate(model, criterion, data_loader_test, device=device)
return

print("Start training")
start_time = time.time()
for epoch in range(args.start_epoch, args.epochs):
if args.distributed:
train_sampler.set_epoch(epoch)
train_one_epoch(model, criterion, optimizer, data_loader, device, epoch, args, model_ema, scaler)
lr_scheduler.step()
evaluate(model, criterion, data_loader_test, device=device)
if model_ema:
evaluate(model_ema, criterion, data_loader_test, device=device, log_suffix="EMA")
if args.output_dir:
checkpoint = {
"model": model_without_ddp.state_dict(),
"optimizer": optimizer.state_dict(),
"lr_scheduler": lr_scheduler.state_dict(),
"epoch": epoch,
"args": args,
}
if model_ema:
checkpoint["model_ema"] = model_ema.state_dict()
if scaler:
checkpoint["scaler"] = scaler.state_dict()
save_on_master(checkpoint, os.path.join(args.output_dir, f"model_{epoch}.pth"))
save_on_master(checkpoint, os.path.join(args.output_dir, "checkpoint.pth"))

total_time = time.time() - start_time
total_time_str = str(datetime.timedelta(seconds=int(total_time)))
print(f"Training time {total_time_str}")
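
The EMA decay adjustment in the script above is easier to follow with concrete numbers. The following is a minimal sketch of the same arithmetic as a standalone function. The helper name `adjusted_ema_decay` and the sample values (8 GPUs, batch size 128, an EMA update every 32 steps, 600 epochs, nominal decay 0.99998) are assumptions chosen for illustration; they are not settings used in this notebook.

```python
def adjusted_ema_decay(world_size, batch_size, ema_steps, epochs, nominal_decay):
    """Hypothetical helper mirroring the decay adjustment in the training script."""
    # Fewer EMA updates over the whole run -> each update should weigh more.
    adjust = world_size * batch_size * ema_steps / epochs
    alpha = 1.0 - nominal_decay
    # Clamp so the effective decay never becomes negative.
    alpha = min(1.0, alpha * adjust)
    return 1.0 - alpha


if __name__ == "__main__":
    # Assumed example values, not taken from this notebook.
    print(adjusted_ema_decay(8, 128, 32, 600, 0.99998))
    print(adjusted_ema_decay(1, 128, 32, 600, 0.99998))
```

With these assumed values the 8-GPU run ends up with a lower effective decay than the single-GPU run, because it performs fewer EMA updates in total and each one has to count for more.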

Training and evaluation

from torchvision.transforms import Compose, ToTensor, PILToTensor
data_train = MNIST('./data/mnist', train=True, download=True, transform=ToTensor())
data_val = MNIST('./data/mnist', train=False, download=True, transform=ToTensor())
num_classes = len(data_train.classes)
sampler_train = torch.utils.data.RandomSampler(data_train, replacement=False)
data_loader_train = torch.utils.data.DataLoader(
    data_train, sampler=sampler_train,
    batch_size=128,
    num_workers=2,
    pin_memory=True,
    drop_last=True,  # drop the final incomplete batch instead of padding it with extra samples
)
sampler_val = torch.utils.data.SequentialSampler(data_val)
data_loader_val = torch.utils.data.DataLoader(
    data_val, sampler=sampler_val,
    batch_size=128,
    shuffle=False,
    num_workers=2,
    pin_memory=True,
    drop_last=False
)
# Mixup as bundled with timm, with mixup alpha 0.8 and cutmix alpha 0.8; not used in this notebook
mixup_fn = Mixup(
    # mixup=0.8, cutmix=0.8, cutmix_minmax is None, mixup_prob=1, mixup_switch_prob=0.5, mixup_mode='batch', label_smoothing=0.1
    mixup_alpha=0.8, cutmix_alpha=0.8, prob=0.8, switch_prob=0.5, mode='batch',
    label_smoothing=0.1, num_classes=num_classes)
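
The `drop_last` settings of the two loaders above determine how many batches each one yields per epoch. Here is a small sketch of that arithmetic, assuming the standard MNIST split of 60,000 training and 10,000 test images. The helper name `num_batches` is introduced only for illustration.

```python
import math


def num_batches(num_samples, batch_size, drop_last):
    """Hypothetical helper: batches yielded per epoch by a loader."""
    if drop_last:
        # The trailing partial batch is discarded.
        return num_samples // batch_size
    # The trailing partial batch is kept as a smaller batch.
    return math.ceil(num_samples / batch_size)


train_batches = num_batches(60_000, 128, drop_last=True)
val_batches = num_batches(10_000, 128, drop_last=False)
print(train_batches, val_batches)  # 468 79
```

These counts agree with the `[ 0/468]` and `[ 0/79]` progress markers in the training log. With `drop_last=True`, 96 training images (60,000 - 468 * 128) are skipped in each epoch. Because the sampler is random, the skipped images differ from epoch to epoch.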
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = SwinTransformer(
    img_size=28, patch_size=7, in_chans=1, num_classes=10,
    embed_dim=96, depths=[2, 2, 4], num_heads=[3, 6, 12],
    window_size=2, mlp_ratio=4., qkv_bias=True, qk_scale=True,
    drop_rate=0., drop_path_rate=0., ape=False, patch_norm=True,
    use_checkpoint=False, fused_window_process=False,
).to(device)
optimizer = Nadam(model.parameters())
loss_fn = LabelSmoothingCrossEntropy()
lr_scheduler = CosineLRScheduler(optimizer, t_initial=15)
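
`LabelSmoothingCrossEntropy` changes which loss values to expect during training. The following pure-Python sketch computes the loss for a single sample. It assumes the usual formulation, loss = (1 - smoothing) * NLL + smoothing * (mean negative log-probability over classes), with smoothing 0.1. The function names are illustrative and are not timm's API.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def label_smoothing_ce(logits, target, smoothing=0.1):
    """Label-smoothed cross entropy for one sample (illustrative sketch)."""
    logp = log_softmax(logits)
    nll = -logp[target]                  # standard cross-entropy term
    smooth = -sum(logp) / len(logp)      # uniform-target term
    return (1.0 - smoothing) * nll + smoothing * smooth


def loss_floor(num_classes, smoothing=0.1):
    """Smallest achievable loss: the entropy of the smoothed target."""
    off = smoothing / num_classes
    on = 1.0 - smoothing + off
    return -(on * math.log(on) + (num_classes - 1) * off * math.log(off))


if __name__ == "__main__":
    print(label_smoothing_ce([0.0] * 10, target=3))  # uniform prediction
    print(loss_floor(10))
```

Under these assumptions the loss cannot drop below roughly 0.50 for 10 classes, even for a perfect classifier. That is consistent with the training loss in the log levelling off near 0.6 instead of approaching zero.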
print("Start training")
start_time = time.time()
for epoch in range(100):
    train_one_epoch(model, loss_fn, optimizer, data_loader_train, device, epoch)
    lr_scheduler.step(epoch)
    evaluate(model, loss_fn, data_loader_val, device=device)
total_time = time.time() - start_time
total_time_str = str(datetime.timedelta(seconds=int(total_time)))
print(f"Training time {total_time_str}")
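
The `lr` column in the training log follows from the scheduler settings. Below is a sketch of single-cycle cosine annealing, assuming the form lr = lr_min + 0.5 * (base_lr - lr_min) * (1 + cos(pi * t / t_initial)) with lr_min = 0 and a base rate of 0.002 (the value the log shows for the first epochs). `cosine_lr` is an illustrative helper, not timm's API.

```python
import math


def cosine_lr(t, base_lr=0.002, t_initial=15, lr_min=0.0):
    """Learning rate set by step(t), for 0 <= t <= t_initial (illustrative)."""
    return lr_min + 0.5 * (base_lr - lr_min) * (1.0 + math.cos(math.pi * t / t_initial))


if __name__ == "__main__":
    for t in range(0, 16, 3):
        print(t, cosine_lr(t))
```

The loop calls `lr_scheduler.step(epoch)` after training epoch `epoch`, so the value from `step(t)` applies to epoch `t + 1`. `cosine_lr(1)` is about 0.00197815, which matches the rate logged for epoch 2. With `t_initial=15` the curve reaches `lr_min` at `t = 15`, well before the 100 epochs of the loop. How the scheduler behaves past that point depends on its cycle settings, which are left at their defaults here.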
Start training
Epoch: [0] [ 0/468] eta: 0:53:11 lr: 0.002 img/s: 19.040995153557684 loss: 2.3678 (2.3678) acc1: 8.5938 (8.5938) acc5: 49.2188 (49.2188) time: 6.8184 data: 0.0960 max mem: 138
Epoch: [0] [ 10/468] eta: 0:05:01 lr: 0.002 img/s: 3501.2711431105545 loss: 2.6789 (2.8085) acc1: 11.7188 (12.9972) acc5: 52.3438 (52.9830) time: 0.6580 data: 0.0090 max mem: 190
Epoch: [0] [ 20/468] eta: 0:02:43 lr: 0.002 img/s: 3585.1146043405674 loss: 2.4055 (2.5948) acc1: 11.7188 (12.7232) acc5: 53.9062 (56.5848) time: 0.0412 data: 0.0003 max mem: 190
Epoch: [0] [ 30/468] eta: 0:01:53 lr: 0.002 img/s: 3649.6010441592343 loss: 2.2687 (2.4918) acc1: 12.5000 (13.7853) acc5: 64.0625 (59.5010) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [0] [ 40/468] eta: 0:01:27 lr: 0.002 img/s: 3546.464652336473 loss: 2.2370 (2.4254) acc1: 17.1875 (14.8438) acc5: 67.1875 (61.1852) time: 0.0361 data: 0.0003 max mem: 190
Epoch: [0] [ 50/468] eta: 0:01:11 lr: 0.002 img/s: 3591.8786094682473 loss: 2.2039 (2.3840) acc1: 17.1875 (15.4105) acc5: 67.1875 (62.3775) time: 0.0362 data: 0.0002 max mem: 190
Epoch: [0] [ 60/468] eta: 0:01:00 lr: 0.002 img/s: 3516.176414340542 loss: 2.0975 (2.3266) acc1: 21.0938 (17.5717) acc5: 74.2188 (65.4201) time: 0.0363 data: 0.0002 max mem: 190
Epoch: [0] [ 70/468] eta: 0:00:53 lr: 0.002 img/s: 3595.655457400995 loss: 1.9138 (2.2534) acc1: 32.8125 (20.4005) acc5: 88.2812 (69.0361) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [0] [ 80/468] eta: 0:00:47 lr: 0.002 img/s: 3523.9311585165738 loss: 1.6661 (2.1731) acc1: 46.0938 (24.2091) acc5: 92.9688 (72.1065) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [0] [ 90/468] eta: 0:00:42 lr: 0.002 img/s: 3560.978423374125 loss: 1.5143 (2.0945) acc1: 54.6875 (27.6700) acc5: 94.5312 (74.6738) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [0] [100/468] eta: 0:00:38 lr: 0.002 img/s: 3397.5516052070348 loss: 1.3297 (2.0148) acc1: 61.7188 (31.3738) acc5: 96.8750 (76.8487) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [0] [110/468] eta: 0:00:35 lr: 0.002 img/s: 3430.11246062728 loss: 1.2297 (1.9384) acc1: 65.6250 (34.9733) acc5: 97.6562 (78.7796) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [0] [120/468] eta: 0:00:32 lr: 0.002 img/s: 3561.07290346973 loss: 1.1007 (1.8659) acc1: 75.0000 (38.4362) acc5: 98.4375 (80.3525) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [0] [130/468] eta: 0:00:30 lr: 0.002 img/s: 3507.3555366825635 loss: 0.9622 (1.7933) acc1: 80.4688 (41.7820) acc5: 99.2188 (81.7987) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [0] [140/468] eta: 0:00:27 lr: 0.002 img/s: 3631.1864186675684 loss: 0.9119 (1.7340) acc1: 82.0312 (44.5867) acc5: 99.2188 (83.0286) time: 0.0364 data: 0.0003 max mem: 190
Epoch: [0] [150/468] eta: 0:00:26 lr: 0.002 img/s: 3552.660252253206 loss: 0.8685 (1.6751) acc1: 83.5938 (47.2941) acc5: 99.2188 (84.1060) time: 0.0363 data: 0.0003 max mem: 190
Epoch: [0] [160/468] eta: 0:00:24 lr: 0.002 img/s: 3542.01905365107 loss: 0.8460 (1.6237) acc1: 86.7188 (49.7234) acc5: 99.2188 (85.0398) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [0] [170/468] eta: 0:00:22 lr: 0.002 img/s: 3549.3250826391645 loss: 0.8027 (1.5745) acc1: 89.0625 (52.0514) acc5: 99.2188 (85.8872) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [0] [180/468] eta: 0:00:21 lr: 0.002 img/s: 3582.865593550626 loss: 0.7574 (1.5300) acc1: 89.0625 (54.1523) acc5: 99.2188 (86.6238) time: 0.0364 data: 0.0002 max mem: 190
Epoch: [0] [190/468] eta: 0:00:20 lr: 0.002 img/s: 3565.306025952637 loss: 0.7399 (1.4878) acc1: 89.8438 (56.1109) acc5: 99.2188 (87.3037) time: 0.0364 data: 0.0002 max mem: 190
Epoch: [0] [200/468] eta: 0:00:18 lr: 0.002 img/s: 3605.6771975069846 loss: 0.7423 (1.4527) acc1: 89.8438 (57.7231) acc5: 99.2188 (87.9120) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [0] [210/468] eta: 0:00:17 lr: 0.002 img/s: 3401.2969342954707 loss: 0.7311 (1.4173) acc1: 90.6250 (59.3676) acc5: 99.2188 (88.4590) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [0] [220/468] eta: 0:00:16 lr: 0.002 img/s: 3480.7953422632554 loss: 0.7156 (1.3896) acc1: 91.4062 (60.6582) acc5: 99.2188 (88.9352) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [0] [230/468] eta: 0:00:15 lr: 0.002 img/s: 3562.7034746370077 loss: 0.7577 (1.3609) acc1: 89.8438 (61.9690) acc5: 99.2188 (89.3939) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [0] [240/468] eta: 0:00:14 lr: 0.002 img/s: 3469.951602895553 loss: 0.6970 (1.3335) acc1: 92.1875 (63.1905) acc5: 100.0000 (89.8178) time: 0.0375 data: 0.0002 max mem: 190
Epoch: [0] [250/468] eta: 0:00:14 lr: 0.002 img/s: 2312.135436719682 loss: 0.6869 (1.3075) acc1: 92.9688 (64.3800) acc5: 100.0000 (90.2141) time: 0.0424 data: 0.0008 max mem: 190
Epoch: [0] [260/468] eta: 0:00:13 lr: 0.002 img/s: 3446.49529764465 loss: 0.6936 (1.2856) acc1: 92.9688 (65.3766) acc5: 100.0000 (90.5741) time: 0.0492 data: 0.0031 max mem: 190
Epoch: [0] [270/468] eta: 0:00:12 lr: 0.002 img/s: 3455.8797038944317 loss: 0.6986 (1.2633) acc1: 91.4062 (66.3947) acc5: 100.0000 (90.9162) time: 0.0451 data: 0.0025 max mem: 190
Epoch: [0] [280/468] eta: 0:00:11 lr: 0.002 img/s: 3372.855566863935 loss: 0.6628 (1.2424) acc1: 92.9688 (67.3349) acc5: 100.0000 (91.2339) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [0] [290/468] eta: 0:00:10 lr: 0.002 img/s: 3639.0379784587644 loss: 0.6992 (1.2249) acc1: 92.1875 (68.1352) acc5: 100.0000 (91.5190) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [0] [300/468] eta: 0:00:10 lr: 0.002 img/s: 3276.9800098882383 loss: 0.7055 (1.2073) acc1: 91.4062 (68.9369) acc5: 100.0000 (91.7956) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [0] [310/468] eta: 0:00:09 lr: 0.002 img/s: 3596.450327576736 loss: 0.6860 (1.1903) acc1: 92.1875 (69.7171) acc5: 100.0000 (92.0418) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [0] [320/468] eta: 0:00:08 lr: 0.002 img/s: 3440.2452452965604 loss: 0.6749 (1.1755) acc1: 92.9688 (70.3831) acc5: 100.0000 (92.2873) time: 0.0365 data: 0.0003 max mem: 190
Epoch: [0] [330/468] eta: 0:00:08 lr: 0.002 img/s: 3599.681596309607 loss: 0.6807 (1.1608) acc1: 92.9688 (71.0583) acc5: 100.0000 (92.5038) time: 0.0365 data: 0.0003 max mem: 190
Epoch: [0] [340/468] eta: 0:00:07 lr: 0.002 img/s: 3264.029960907339 loss: 0.6687 (1.1463) acc1: 93.7500 (71.7238) acc5: 100.0000 (92.7144) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [0] [350/468] eta: 0:00:06 lr: 0.002 img/s: 2509.434432857657 loss: 0.6476 (1.1321) acc1: 93.7500 (72.3669) acc5: 100.0000 (92.9153) time: 0.0386 data: 0.0006 max mem: 190
Epoch: [0] [360/468] eta: 0:00:06 lr: 0.002 img/s: 2476.787392566006 loss: 0.6356 (1.1186) acc1: 93.7500 (72.9787) acc5: 100.0000 (93.1073) time: 0.0489 data: 0.0037 max mem: 190
Epoch: [0] [370/468] eta: 0:00:05 lr: 0.002 img/s: 2908.151346900747 loss: 0.6492 (1.1062) acc1: 93.7500 (73.5428) acc5: 100.0000 (93.2867) time: 0.0485 data: 0.0035 max mem: 190
Epoch: [0] [380/468] eta: 0:00:04 lr: 0.002 img/s: 3440.5098049268154 loss: 0.6544 (1.0938) acc1: 93.7500 (74.0957) acc5: 100.0000 (93.4568) time: 0.0406 data: 0.0004 max mem: 190
Epoch: [0] [390/468] eta: 0:00:04 lr: 0.002 img/s: 3392.3991482209317 loss: 0.6606 (1.0837) acc1: 92.9688 (74.5524) acc5: 100.0000 (93.6141) time: 0.0395 data: 0.0003 max mem: 190
Epoch: [0] [400/468] eta: 0:00:03 lr: 0.002 img/s: 3594.0427104392884 loss: 0.6609 (1.0728) acc1: 93.7500 (75.0487) acc5: 100.0000 (93.7695) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [0] [410/468] eta: 0:00:03 lr: 0.002 img/s: 3482.601694365521 loss: 0.6539 (1.0631) acc1: 92.9688 (75.4733) acc5: 100.0000 (93.9154) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [0] [420/468] eta: 0:00:02 lr: 0.002 img/s: 3460.6245576490455 loss: 0.6279 (1.0527) acc1: 95.3125 (75.9538) acc5: 100.0000 (94.0525) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [0] [430/468] eta: 0:00:02 lr: 0.002 img/s: 3554.1184196589343 loss: 0.6099 (1.0425) acc1: 95.3125 (76.4084) acc5: 100.0000 (94.1905) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [0] [440/468] eta: 0:00:01 lr: 0.002 img/s: 3696.159833667238 loss: 0.6209 (1.0333) acc1: 94.5312 (76.8282) acc5: 100.0000 (94.3187) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [0] [450/468] eta: 0:00:00 lr: 0.002 img/s: 3577.42225065302 loss: 0.6355 (1.0245) acc1: 94.5312 (77.2208) acc5: 100.0000 (94.4429) time: 0.0362 data: 0.0002 max mem: 190
Epoch: [0] [460/468] eta: 0:00:00 lr: 0.002 img/s: 3550.005699889573 loss: 0.6098 (1.0152) acc1: 95.3125 (77.6318) acc5: 100.0000 (94.5618) time: 0.0362 data: 0.0002 max mem: 190
Epoch: [0] Total time: 0:00:24
Test: [ 0/79] eta: 0:00:09 loss: 0.5895 (0.5895) acc1: 96.0938 (96.0938) acc5: 100.0000 (100.0000) time: 0.1171 data: 0.0978 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 95.010 Acc@5 99.930
Epoch: [1] [ 0/468] eta: 0:01:13 lr: 0.002 img/s: 2390.055122491953 loss: 0.6364 (0.6364) acc1: 94.5312 (94.5312) acc5: 100.0000 (100.0000) time: 0.1565 data: 0.1029 max mem: 190
Epoch: [1] [ 10/468] eta: 0:00:21 lr: 0.002 img/s: 3516.7061566980865 loss: 0.5902 (0.6113) acc1: 96.0938 (95.3125) acc5: 100.0000 (99.9290) time: 0.0476 data: 0.0096 max mem: 190
Epoch: [1] [ 20/468] eta: 0:00:18 lr: 0.002 img/s: 3492.6156808660126 loss: 0.6155 (0.6309) acc1: 94.5312 (94.4568) acc5: 100.0000 (99.8140) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [1] [ 30/468] eta: 0:00:21 lr: 0.002 img/s: 1832.5428361749698 loss: 0.6361 (0.6283) acc1: 94.5312 (94.7329) acc5: 100.0000 (99.8236) time: 0.0500 data: 0.0016 max mem: 190
Epoch: [1] [ 40/468] eta: 0:00:20 lr: 0.002 img/s: 3413.4288221156903 loss: 0.6217 (0.6312) acc1: 94.5312 (94.6837) acc5: 100.0000 (99.8285) time: 0.0549 data: 0.0024 max mem: 190
Epoch: [1] [ 50/468] eta: 0:00:19 lr: 0.002 img/s: 3562.2306916503 loss: 0.6316 (0.6319) acc1: 94.5312 (94.5925) acc5: 100.0000 (99.7702) time: 0.0420 data: 0.0011 max mem: 190
Epoch: [1] [ 60/468] eta: 0:00:18 lr: 0.002 img/s: 3474.66773671607 loss: 0.6233 (0.6315) acc1: 94.5312 (94.6337) acc5: 100.0000 (99.8079) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [1] [ 70/468] eta: 0:00:17 lr: 0.002 img/s: 3406.649398775342 loss: 0.6054 (0.6279) acc1: 95.3125 (94.8393) acc5: 100.0000 (99.8349) time: 0.0382 data: 0.0002 max mem: 190
Epoch: [1] [ 80/468] eta: 0:00:16 lr: 0.002 img/s: 3441.9214771124502 loss: 0.6031 (0.6286) acc1: 95.3125 (94.7531) acc5: 100.0000 (99.8553) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [1] [ 90/468] eta: 0:00:16 lr: 0.002 img/s: 3423.2448431751377 loss: 0.6496 (0.6324) acc1: 94.5312 (94.5999) acc5: 100.0000 (99.8369) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [1] [100/468] eta: 0:00:15 lr: 0.002 img/s: 3509.3960164987807 loss: 0.6414 (0.6306) acc1: 93.7500 (94.6473) acc5: 100.0000 (99.8453) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [1] [110/468] eta: 0:00:14 lr: 0.002 img/s: 3536.1166606290135 loss: 0.6056 (0.6282) acc1: 95.3125 (94.7494) acc5: 100.0000 (99.8592) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [1] [120/468] eta: 0:00:14 lr: 0.002 img/s: 3551.7201338996283 loss: 0.6056 (0.6265) acc1: 96.0938 (94.8670) acc5: 100.0000 (99.8580) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [1] [130/468] eta: 0:00:13 lr: 0.002 img/s: 3513.990038028289 loss: 0.6123 (0.6287) acc1: 95.3125 (94.7758) acc5: 100.0000 (99.8628) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [1] [140/468] eta: 0:00:13 lr: 0.002 img/s: 2365.0075857025804 loss: 0.6116 (0.6272) acc1: 95.3125 (94.8637) acc5: 100.0000 (99.8670) time: 0.0377 data: 0.0002 max mem: 190
Epoch: [1] [150/468] eta: 0:00:13 lr: 0.002 img/s: 2116.48142804204 loss: 0.6052 (0.6264) acc1: 96.0938 (94.9141) acc5: 100.0000 (99.8707) time: 0.0510 data: 0.0014 max mem: 190
Epoch: [1] [160/468] eta: 0:00:13 lr: 0.002 img/s: 2346.6791619860214 loss: 0.6072 (0.6246) acc1: 95.3125 (94.9874) acc5: 100.0000 (99.8690) time: 0.0622 data: 0.0044 max mem: 190
Epoch: [1] [170/468] eta: 0:00:12 lr: 0.002 img/s: 3253.033634881875 loss: 0.5855 (0.6233) acc1: 95.3125 (95.0612) acc5: 100.0000 (99.8629) time: 0.0514 data: 0.0036 max mem: 190
Epoch: [1] [180/468] eta: 0:00:12 lr: 0.002 img/s: 3258.898336773097 loss: 0.6087 (0.6239) acc1: 95.3125 (95.0233) acc5: 100.0000 (99.8576) time: 0.0408 data: 0.0006 max mem: 190
Epoch: [1] [190/468] eta: 0:00:11 lr: 0.002 img/s: 3504.1734623945067 loss: 0.6345 (0.6238) acc1: 94.5312 (95.0344) acc5: 100.0000 (99.8527) time: 0.0392 data: 0.0003 max mem: 190
Epoch: [1] [200/468] eta: 0:00:11 lr: 0.002 img/s: 3558.052024998509 loss: 0.6168 (0.6241) acc1: 95.3125 (95.0249) acc5: 100.0000 (99.8445) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [1] [210/468] eta: 0:00:10 lr: 0.002 img/s: 3303.73967410033 loss: 0.6099 (0.6240) acc1: 95.3125 (95.0163) acc5: 100.0000 (99.8408) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [1] [220/468] eta: 0:00:10 lr: 0.002 img/s: 3516.1533856844394 loss: 0.6165 (0.6246) acc1: 94.5312 (94.9625) acc5: 100.0000 (99.8374) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [1] [230/468] eta: 0:00:09 lr: 0.002 img/s: 3532.231380598979 loss: 0.6224 (0.6240) acc1: 94.5312 (95.0081) acc5: 100.0000 (99.8444) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [1] [240/468] eta: 0:00:09 lr: 0.002 img/s: 3485.9936626668746 loss: 0.6139 (0.6245) acc1: 95.3125 (94.9721) acc5: 100.0000 (99.8412) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [1] [250/468] eta: 0:00:09 lr: 0.002 img/s: 3558.971905866755 loss: 0.6313 (0.6250) acc1: 95.3125 (94.9608) acc5: 100.0000 (99.8381) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [1] [260/468] eta: 0:00:08 lr: 0.002 img/s: 3356.2403070729297 loss: 0.6108 (0.6250) acc1: 95.3125 (94.9683) acc5: 100.0000 (99.8414) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [1] [270/468] eta: 0:00:08 lr: 0.002 img/s: 3548.9027618027735 loss: 0.6029 (0.6244) acc1: 96.0938 (94.9983) acc5: 100.0000 (99.8443) time: 0.0386 data: 0.0003 max mem: 190
Epoch: [1] [280/468] eta: 0:00:07 lr: 0.002 img/s: 3527.7286478388287 loss: 0.6134 (0.6240) acc1: 96.0938 (95.0317) acc5: 100.0000 (99.8443) time: 0.0386 data: 0.0003 max mem: 190
Epoch: [1] [290/468] eta: 0:00:07 lr: 0.002 img/s: 3131.92185230342 loss: 0.6139 (0.6236) acc1: 95.3125 (95.0279) acc5: 100.0000 (99.8443) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [1] [300/468] eta: 0:00:06 lr: 0.002 img/s: 2843.4905061782665 loss: 0.6021 (0.6229) acc1: 95.3125 (95.0607) acc5: 100.0000 (99.8469) time: 0.0470 data: 0.0026 max mem: 190
Epoch: [1] [310/468] eta: 0:00:06 lr: 0.002 img/s: 3476.4452214905036 loss: 0.6042 (0.6231) acc1: 95.3125 (95.0437) acc5: 100.0000 (99.8493) time: 0.0499 data: 0.0033 max mem: 190
Epoch: [1] [320/468] eta: 0:00:06 lr: 0.002 img/s: 3499.6506808685394 loss: 0.6194 (0.6227) acc1: 95.3125 (95.0594) acc5: 100.0000 (99.8491) time: 0.0403 data: 0.0009 max mem: 190
Epoch: [1] [330/468] eta: 0:00:05 lr: 0.002 img/s: 3577.5414448213132 loss: 0.6167 (0.6226) acc1: 95.3125 (95.0600) acc5: 100.0000 (99.8513) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [1] [340/468] eta: 0:00:05 lr: 0.002 img/s: 3553.3421493292035 loss: 0.6114 (0.6226) acc1: 94.5312 (95.0536) acc5: 100.0000 (99.8534) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [1] [350/468] eta: 0:00:04 lr: 0.002 img/s: 3620.0947519605133 loss: 0.6088 (0.6230) acc1: 94.5312 (95.0521) acc5: 100.0000 (99.8442) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [1] [360/468] eta: 0:00:04 lr: 0.002 img/s: 3531.1394576391585 loss: 0.6052 (0.6223) acc1: 96.0938 (95.0918) acc5: 100.0000 (99.8442) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [1] [370/468] eta: 0:00:03 lr: 0.002 img/s: 3390.620891751926 loss: 0.5937 (0.6220) acc1: 96.0938 (95.0998) acc5: 100.0000 (99.8484) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [1] [380/468] eta: 0:00:03 lr: 0.002 img/s: 3559.184253618048 loss: 0.5937 (0.6215) acc1: 95.3125 (95.1013) acc5: 100.0000 (99.8483) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [1] [390/468] eta: 0:00:03 lr: 0.002 img/s: 3451.502838370396 loss: 0.6005 (0.6211) acc1: 96.0938 (95.1247) acc5: 100.0000 (99.8461) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [1] [400/468] eta: 0:00:02 lr: 0.002 img/s: 3515.7389214498544 loss: 0.6005 (0.6209) acc1: 95.3125 (95.1391) acc5: 100.0000 (99.8402) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [1] [410/468] eta: 0:00:02 lr: 0.002 img/s: 3433.4687779795863 loss: 0.6419 (0.6247) acc1: 94.5312 (94.9856) acc5: 100.0000 (99.8270) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [1] [420/468] eta: 0:00:01 lr: 0.002 img/s: 3529.1896162973385 loss: 0.6978 (0.6257) acc1: 90.6250 (94.9358) acc5: 100.0000 (99.8256) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [1] [430/468] eta: 0:00:01 lr: 0.002 img/s: 3377.545010160236 loss: 0.6389 (0.6257) acc1: 94.5312 (94.9355) acc5: 100.0000 (99.8242) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [1] [440/468] eta: 0:00:01 lr: 0.002 img/s: 3556.0723573089226 loss: 0.6291 (0.6259) acc1: 94.5312 (94.9157) acc5: 100.0000 (99.8228) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [1] [450/468] eta: 0:00:00 lr: 0.002 img/s: 3517.881372369145 loss: 0.6064 (0.6255) acc1: 95.3125 (94.9349) acc5: 100.0000 (99.8268) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [1] [460/468] eta: 0:00:00 lr: 0.002 img/s: 3552.3546592029434 loss: 0.6011 (0.6259) acc1: 94.5312 (94.9109) acc5: 100.0000 (99.8288) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [1] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5358 (0.5358) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1236 data: 0.0995 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 96.310 Acc@5 99.900
Epoch: [2] [ 0/468] eta: 0:01:13 lr: 0.0019781476007338056 img/s: 2821.315311811909 loss: 0.5854 (0.5854) acc1: 96.0938 (96.0938) acc5: 100.0000 (100.0000) time: 0.1570 data: 0.1116 max mem: 190
Epoch: [2] [ 10/468] eta: 0:00:22 lr: 0.0019781476007338056 img/s: 3476.2426314426316 loss: 0.6098 (0.6028) acc1: 96.0938 (95.8097) acc5: 100.0000 (99.9290) time: 0.0488 data: 0.0105 max mem: 190
Epoch: [2] [ 20/468] eta: 0:00:20 lr: 0.0019781476007338056 img/s: 3316.966389877422 loss: 0.6098 (0.6142) acc1: 95.3125 (95.2381) acc5: 100.0000 (99.8140) time: 0.0404 data: 0.0008 max mem: 190
Epoch: [2] [ 30/468] eta: 0:00:19 lr: 0.0019781476007338056 img/s: 3366.1095596671953 loss: 0.6261 (0.6167) acc1: 94.5312 (95.1613) acc5: 100.0000 (99.8236) time: 0.0425 data: 0.0007 max mem: 190
Epoch: [2] [ 40/468] eta: 0:00:18 lr: 0.0019781476007338056 img/s: 3201.510569910491 loss: 0.6153 (0.6128) acc1: 95.3125 (95.4649) acc5: 100.0000 (99.8095) time: 0.0400 data: 0.0003 max mem: 190
Epoch: [2] [ 50/468] eta: 0:00:17 lr: 0.0019781476007338056 img/s: 3468.8081875803605 loss: 0.5954 (0.6109) acc1: 96.0938 (95.6342) acc5: 100.0000 (99.8315) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [2] [ 60/468] eta: 0:00:16 lr: 0.0019781476007338056 img/s: 3455.4570860338936 loss: 0.5947 (0.6105) acc1: 96.0938 (95.6327) acc5: 100.0000 (99.8335) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [2] [ 70/468] eta: 0:00:17 lr: 0.0019781476007338056 img/s: 2209.2453098830915 loss: 0.6161 (0.6150) acc1: 95.3125 (95.4445) acc5: 100.0000 (99.8129) time: 0.0469 data: 0.0014 max mem: 190
Epoch: [2] [ 80/468] eta: 0:00:17 lr: 0.0019781476007338056 img/s: 3254.7888546693503 loss: 0.6118 (0.6127) acc1: 95.3125 (95.4572) acc5: 100.0000 (99.8264) time: 0.0529 data: 0.0034 max mem: 190
Epoch: [2] [ 90/468] eta: 0:00:16 lr: 0.0019781476007338056 img/s: 3281.928012519562 loss: 0.6002 (0.6121) acc1: 96.0938 (95.5100) acc5: 100.0000 (99.7940) time: 0.0450 data: 0.0025 max mem: 190
Epoch: [2] [100/468] eta: 0:00:15 lr: 0.0019781476007338056 img/s: 3343.969205663069 loss: 0.6085 (0.6118) acc1: 96.0938 (95.5523) acc5: 100.0000 (99.7912) time: 0.0400 data: 0.0005 max mem: 190
Epoch: [2] [110/468] eta: 0:00:15 lr: 0.0019781476007338056 img/s: 2319.247087284273 loss: 0.5869 (0.6091) acc1: 96.0938 (95.6503) acc5: 100.0000 (99.8029) time: 0.0431 data: 0.0003 max mem: 190
Epoch: [2] [120/468] eta: 0:00:15 lr: 0.0019781476007338056 img/s: 2519.728123078656 loss: 0.5794 (0.6075) acc1: 96.0938 (95.6999) acc5: 100.0000 (99.8063) time: 0.0471 data: 0.0007 max mem: 190
Epoch: [2] [130/468] eta: 0:00:14 lr: 0.0019781476007338056 img/s: 2833.570376000169 loss: 0.5886 (0.6080) acc1: 96.0938 (95.6763) acc5: 100.0000 (99.8151) time: 0.0471 data: 0.0007 max mem: 190
Epoch: [2] [140/468] eta: 0:00:14 lr: 0.0019781476007338056 img/s: 3309.3399577140954 loss: 0.5939 (0.6070) acc1: 96.0938 (95.7170) acc5: 100.0000 (99.8227) time: 0.0441 data: 0.0003 max mem: 190
Epoch: [2] [150/468] eta: 0:00:13 lr: 0.0019781476007338056 img/s: 3178.707086017431 loss: 0.6062 (0.6089) acc1: 96.0938 (95.6178) acc5: 100.0000 (99.8241) time: 0.0411 data: 0.0003 max mem: 190
Epoch: [2] [160/468] eta: 0:00:13 lr: 0.0019781476007338056 img/s: 3495.025792591628 loss: 0.6187 (0.6087) acc1: 95.3125 (95.6619) acc5: 100.0000 (99.8205) time: 0.0391 data: 0.0003 max mem: 190
Epoch: [2] [170/468] eta: 0:00:12 lr: 0.0019781476007338056 img/s: 3553.836100299203 loss: 0.6058 (0.6102) acc1: 96.0938 (95.6232) acc5: 100.0000 (99.8173) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [2] [180/468] eta: 0:00:12 lr: 0.0019781476007338056 img/s: 3498.419220518568 loss: 0.6069 (0.6115) acc1: 95.3125 (95.5413) acc5: 100.0000 (99.8187) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [2] [190/468] eta: 0:00:11 lr: 0.0019781476007338056 img/s: 3494.0022257655137 loss: 0.6069 (0.6111) acc1: 95.3125 (95.5334) acc5: 100.0000 (99.8282) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [2] [200/468] eta: 0:00:11 lr: 0.0019781476007338056 img/s: 3423.7469516861383 loss: 0.5874 (0.6098) acc1: 96.0938 (95.5729) acc5: 100.0000 (99.8368) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [2] [210/468] eta: 0:00:10 lr: 0.0019781476007338056 img/s: 3504.1963343950706 loss: 0.5874 (0.6095) acc1: 96.0938 (95.5791) acc5: 100.0000 (99.8408) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [2] [220/468] eta: 0:00:10 lr: 0.0019781476007338056 img/s: 3503.6932193434704 loss: 0.5977 (0.6110) acc1: 95.3125 (95.5246) acc5: 100.0000 (99.8409) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [2] [230/468] eta: 0:00:09 lr: 0.0019781476007338056 img/s: 3421.2596831546884 loss: 0.6120 (0.6113) acc1: 94.5312 (95.4917) acc5: 100.0000 (99.8410) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [2] [240/468] eta: 0:00:09 lr: 0.0019781476007338056 img/s: 3483.8672567520216 loss: 0.6175 (0.6122) acc1: 94.5312 (95.4487) acc5: 100.0000 (99.8476) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [2] [250/468] eta: 0:00:09 lr: 0.0019781476007338056 img/s: 3258.5818543785963 loss: 0.6107 (0.6115) acc1: 95.3125 (95.4681) acc5: 100.0000 (99.8444) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [2] [260/468] eta: 0:00:08 lr: 0.0019781476007338056 img/s: 1981.9730431155099 loss: 0.6107 (0.6116) acc1: 95.3125 (95.4622) acc5: 100.0000 (99.8443) time: 0.0385 data: 0.0002 max mem: 190
Epoch: [2] [270/468] eta: 0:00:08 lr: 0.0019781476007338056 img/s: 3555.4600494043007 loss: 0.6134 (0.6117) acc1: 95.3125 (95.4595) acc5: 100.0000 (99.8443) time: 0.0386 data: 0.0002 max mem: 190
Epoch: [2] [280/468] eta: 0:00:07 lr: 0.0019781476007338056 img/s: 3548.410181164449 loss: 0.5971 (0.6111) acc1: 96.0938 (95.4932) acc5: 100.0000 (99.8443) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [2] [290/468] eta: 0:00:07 lr: 0.0019781476007338056 img/s: 3422.0447458664253 loss: 0.5899 (0.6107) acc1: 96.0938 (95.4924) acc5: 100.0000 (99.8416) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [2] [300/468] eta: 0:00:06 lr: 0.0019781476007338056 img/s: 3448.1997739184053 loss: 0.6042 (0.6106) acc1: 96.0938 (95.5124) acc5: 100.0000 (99.8443) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [2] [310/468] eta: 0:00:06 lr: 0.0019781476007338056 img/s: 3380.6714607760414 loss: 0.6186 (0.6109) acc1: 95.3125 (95.5110) acc5: 100.0000 (99.8468) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [2] [320/468] eta: 0:00:06 lr: 0.0019781476007338056 img/s: 3389.9786070594178 loss: 0.6108 (0.6104) acc1: 95.3125 (95.5242) acc5: 100.0000 (99.8515) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [2] [330/468] eta: 0:00:05 lr: 0.0019781476007338056 img/s: 3506.7369837423335 loss: 0.5978 (0.6102) acc1: 96.0938 (95.5249) acc5: 100.0000 (99.8513) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [2] [340/468] eta: 0:00:05 lr: 0.0019781476007338056 img/s: 1529.856215929718 loss: 0.6067 (0.6103) acc1: 95.3125 (95.5187) acc5: 100.0000 (99.8534) time: 0.0402 data: 0.0006 max mem: 190
Epoch: [2] [350/468] eta: 0:00:04 lr: 0.0019781476007338056 img/s: 2602.5688343796205 loss: 0.6111 (0.6111) acc1: 94.5312 (95.4950) acc5: 100.0000 (99.8397) time: 0.0504 data: 0.0036 max mem: 190
Epoch: [2] [360/468] eta: 0:00:04 lr: 0.0019781476007338056 img/s: 3450.659845100749 loss: 0.6074 (0.6105) acc1: 95.3125 (95.5181) acc5: 100.0000 (99.8377) time: 0.0477 data: 0.0034 max mem: 190
Epoch: [2] [370/468] eta: 0:00:04 lr: 0.0019781476007338056 img/s: 3490.935119318551 loss: 0.6074 (0.6108) acc1: 95.3125 (95.5041) acc5: 100.0000 (99.8357) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [2] [380/468] eta: 0:00:03 lr: 0.0019781476007338056 img/s: 3499.1032581420964 loss: 0.5872 (0.6100) acc1: 96.0938 (95.5504) acc5: 100.0000 (99.8380) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [2] [390/468] eta: 0:00:03 lr: 0.0019781476007338056 img/s: 3348.140069473461 loss: 0.5739 (0.6090) acc1: 96.8750 (95.5942) acc5: 100.0000 (99.8362) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [2] [400/468] eta: 0:00:02 lr: 0.0019781476007338056 img/s: 3369.2571543327645 loss: 0.5725 (0.6083) acc1: 96.8750 (95.6242) acc5: 100.0000 (99.8363) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [2] [410/468] eta: 0:00:02 lr: 0.0019781476007338056 img/s: 3479.306511820821 loss: 0.5725 (0.6077) acc1: 96.8750 (95.6490) acc5: 100.0000 (99.8384) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [2] [420/468] eta: 0:00:01 lr: 0.0019781476007338056 img/s: 3559.3966267105125 loss: 0.5968 (0.6074) acc1: 96.0938 (95.6632) acc5: 100.0000 (99.8404) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [2] [430/468] eta: 0:00:01 lr: 0.0019781476007338056 img/s: 2184.8355973726834 loss: 0.5968 (0.6067) acc1: 96.0938 (95.6805) acc5: 100.0000 (99.8423) time: 0.0443 data: 0.0014 max mem: 190
Epoch: [2] [440/468] eta: 0:00:01 lr: 0.0019781476007338056 img/s: 2506.7746442045495 loss: 0.5753 (0.6061) acc1: 96.0938 (95.7022) acc5: 100.0000 (99.8441) time: 0.0546 data: 0.0031 max mem: 190
Epoch: [2] [450/468] eta: 0:00:00 lr: 0.0019781476007338056 img/s: 3444.5714872321314 loss: 0.5954 (0.6062) acc1: 96.0938 (95.7023) acc5: 100.0000 (99.8424) time: 0.0545 data: 0.0039 max mem: 190
Epoch: [2] [460/468] eta: 0:00:00 lr: 0.0019781476007338056 img/s: 3485.7446938364747 loss: 0.5979 (0.6060) acc1: 95.3125 (95.7074) acc5: 100.0000 (99.8424) time: 0.0444 data: 0.0021 max mem: 190
Epoch: [2] Total time: 0:00:19
Test: [ 0/79] eta: 0:00:09 loss: 0.5610 (0.5610) acc1: 96.8750 (96.8750) acc5: 100.0000 (100.0000) time: 0.1196 data: 0.0968 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 96.960 Acc@5 99.950
Epoch: [3] [ 0/468] eta: 0:01:21 lr: 0.001913545457642601 img/s: 2668.437331318684 loss: 0.6060 (0.6060) acc1: 94.5312 (94.5312) acc5: 100.0000 (100.0000) time: 0.1742 data: 0.1262 max mem: 190
Epoch: [3] [ 10/468] eta: 0:00:23 lr: 0.001913545457642601 img/s: 3215.315721703509 loss: 0.6060 (0.5948) acc1: 95.3125 (96.0227) acc5: 100.0000 (99.9290) time: 0.0504 data: 0.0117 max mem: 190
Epoch: [3] [ 20/468] eta: 0:00:19 lr: 0.001913545457642601 img/s: 3411.194916923468 loss: 0.5929 (0.5942) acc1: 96.0938 (96.0193) acc5: 100.0000 (99.9256) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [3] [ 30/468] eta: 0:00:18 lr: 0.001913545457642601 img/s: 3474.3529289948488 loss: 0.5953 (0.5982) acc1: 96.0938 (96.1190) acc5: 100.0000 (99.8236) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [3] [ 40/468] eta: 0:00:17 lr: 0.001913545457642601 img/s: 3487.4429143065013 loss: 0.6071 (0.5976) acc1: 95.3125 (96.0938) acc5: 100.0000 (99.8666) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [3] [ 50/468] eta: 0:00:16 lr: 0.001913545457642601 img/s: 3424.161848088833 loss: 0.5653 (0.5907) acc1: 97.6562 (96.3848) acc5: 100.0000 (99.8928) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [3] [ 60/468] eta: 0:00:16 lr: 0.001913545457642601 img/s: 3444.3946929453127 loss: 0.5667 (0.5880) acc1: 96.8750 (96.4267) acc5: 100.0000 (99.9103) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [3] [ 70/468] eta: 0:00:15 lr: 0.001913545457642601 img/s: 3553.5538257876624 loss: 0.5666 (0.5839) acc1: 96.8750 (96.6329) acc5: 100.0000 (99.9120) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [3] [ 80/468] eta: 0:00:15 lr: 0.001913545457642601 img/s: 3473.8358687001364 loss: 0.5538 (0.5815) acc1: 97.6562 (96.7207) acc5: 100.0000 (99.9228) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [3] [ 90/468] eta: 0:00:14 lr: 0.001913545457642601 img/s: 3484.319466258226 loss: 0.5678 (0.5810) acc1: 96.8750 (96.7119) acc5: 100.0000 (99.9227) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [3] [100/468] eta: 0:00:14 lr: 0.001913545457642601 img/s: 3542.579987858632 loss: 0.5647 (0.5788) acc1: 96.8750 (96.7899) acc5: 100.0000 (99.9149) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [3] [110/468] eta: 0:00:14 lr: 0.001913545457642601 img/s: 2912.3950960182274 loss: 0.5647 (0.5789) acc1: 96.8750 (96.7483) acc5: 100.0000 (99.9226) time: 0.0464 data: 0.0015 max mem: 190
Epoch: [3] [120/468] eta: 0:00:14 lr: 0.001913545457642601 img/s: 3437.117709573746 loss: 0.5790 (0.5800) acc1: 96.0938 (96.7265) acc5: 100.0000 (99.9290) time: 0.0512 data: 0.0029 max mem: 190
Epoch: [3] [130/468] eta: 0:00:13 lr: 0.001913545457642601 img/s: 3497.894972765891 loss: 0.6027 (0.5813) acc1: 96.0938 (96.6961) acc5: 100.0000 (99.9165) time: 0.0423 data: 0.0016 max mem: 190
Epoch: [3] [140/468] eta: 0:00:13 lr: 0.001913545457642601 img/s: 3335.4720610345557 loss: 0.5973 (0.5822) acc1: 96.0938 (96.6201) acc5: 100.0000 (99.9113) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [3] [150/468] eta: 0:00:12 lr: 0.001913545457642601 img/s: 3523.792381002389 loss: 0.5937 (0.5826) acc1: 96.0938 (96.6215) acc5: 100.0000 (99.9120) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [3] [160/468] eta: 0:00:12 lr: 0.001913545457642601 img/s: 3514.9793240712856 loss: 0.6063 (0.5881) acc1: 96.0938 (96.4431) acc5: 100.0000 (99.8932) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [3] [170/468] eta: 0:00:11 lr: 0.001913545457642601 img/s: 3534.068262755656 loss: 0.6085 (0.5878) acc1: 96.0938 (96.4638) acc5: 100.0000 (99.8995) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [3] [180/468] eta: 0:00:11 lr: 0.001913545457642601 img/s: 3485.1111154387945 loss: 0.5896 (0.5887) acc1: 96.8750 (96.4132) acc5: 100.0000 (99.9050) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [3] [190/468] eta: 0:00:10 lr: 0.001913545457642601 img/s: 3553.4832641660546 loss: 0.5913 (0.5890) acc1: 95.3125 (96.3719) acc5: 100.0000 (99.9059) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [3] [200/468] eta: 0:00:10 lr: 0.001913545457642601 img/s: 3421.3905019245967 loss: 0.5847 (0.5893) acc1: 95.3125 (96.3386) acc5: 100.0000 (99.9067) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [3] [210/468] eta: 0:00:10 lr: 0.001913545457642601 img/s: 3595.2701922626184 loss: 0.5847 (0.5894) acc1: 96.0938 (96.3344) acc5: 100.0000 (99.9111) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [3] [220/468] eta: 0:00:09 lr: 0.001913545457642601 img/s: 3537.258275353152 loss: 0.5805 (0.5894) acc1: 96.8750 (96.3518) acc5: 100.0000 (99.9152) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [3] [230/468] eta: 0:00:09 lr: 0.001913545457642601 img/s: 3386.7281008314303 loss: 0.5732 (0.5886) acc1: 96.8750 (96.3846) acc5: 100.0000 (99.9154) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [3] [240/468] eta: 0:00:08 lr: 0.001913545457642601 img/s: 3616.8023821393444 loss: 0.5657 (0.5874) acc1: 96.8750 (96.4341) acc5: 100.0000 (99.9157) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [3] [250/468] eta: 0:00:08 lr: 0.001913545457642601 img/s: 3618.533177863002 loss: 0.5598 (0.5867) acc1: 96.8750 (96.4704) acc5: 100.0000 (99.9128) time: 0.0363 data: 0.0003 max mem: 190
Epoch: [3] [260/468] eta: 0:00:08 lr: 0.001913545457642601 img/s: 3545.551224731048 loss: 0.5767 (0.5865) acc1: 96.8750 (96.4559) acc5: 100.0000 (99.9162) time: 0.0392 data: 0.0003 max mem: 190
Epoch: [3] [270/468] eta: 0:00:07 lr: 0.001913545457642601 img/s: 3522.613213303851 loss: 0.5849 (0.5870) acc1: 96.0938 (96.4310) acc5: 100.0000 (99.9193) time: 0.0415 data: 0.0003 max mem: 190
Epoch: [3] [280/468] eta: 0:00:07 lr: 0.001913545457642601 img/s: 3410.609813737199 loss: 0.5820 (0.5864) acc1: 96.8750 (96.4552) acc5: 100.0000 (99.9166) time: 0.0389 data: 0.0003 max mem: 190
Epoch: [3] [290/468] eta: 0:00:06 lr: 0.001913545457642601 img/s: 3503.327408219464 loss: 0.5733 (0.5860) acc1: 96.8750 (96.4696) acc5: 100.0000 (99.9168) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [3] [300/468] eta: 0:00:06 lr: 0.001913545457642601 img/s: 3625.130232212671 loss: 0.5620 (0.5857) acc1: 96.8750 (96.4805) acc5: 100.0000 (99.9195) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [3] [310/468] eta: 0:00:06 lr: 0.001913545457642601 img/s: 3454.3232016471497 loss: 0.5660 (0.5855) acc1: 96.8750 (96.4957) acc5: 100.0000 (99.9196) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [3] [320/468] eta: 0:00:05 lr: 0.001913545457642601 img/s: 3379.0330746524173 loss: 0.6028 (0.5871) acc1: 96.0938 (96.4393) acc5: 100.0000 (99.9124) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [3] [330/468] eta: 0:00:05 lr: 0.001913545457642601 img/s: 3490.503884688152 loss: 0.6150 (0.5877) acc1: 95.3125 (96.4171) acc5: 100.0000 (99.9103) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [3] [340/468] eta: 0:00:04 lr: 0.001913545457642601 img/s: 3462.0081379977432 loss: 0.5826 (0.5875) acc1: 96.0938 (96.4214) acc5: 100.0000 (99.9084) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [3] [350/468] eta: 0:00:04 lr: 0.001913545457642601 img/s: 3005.1716606306222 loss: 0.5865 (0.5878) acc1: 96.0938 (96.4009) acc5: 100.0000 (99.9110) time: 0.0377 data: 0.0002 max mem: 190
Epoch: [3] [360/468] eta: 0:00:04 lr: 0.001913545457642601 img/s: 3514.818239549576 loss: 0.5899 (0.5879) acc1: 96.0938 (96.4140) acc5: 100.0000 (99.9134) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [3] [370/468] eta: 0:00:03 lr: 0.001913545457642601 img/s: 3496.049959300622 loss: 0.5930 (0.5880) acc1: 96.0938 (96.3907) acc5: 100.0000 (99.9137) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [3] [380/468] eta: 0:00:03 lr: 0.001913545457642601 img/s: 3580.0702315935478 loss: 0.6009 (0.5906) acc1: 94.5312 (96.2803) acc5: 100.0000 (99.9118) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [3] [390/468] eta: 0:00:03 lr: 0.001913545457642601 img/s: 2281.0239118981663 loss: 0.6759 (0.5998) acc1: 92.1875 (95.9359) acc5: 100.0000 (99.8441) time: 0.0419 data: 0.0004 max mem: 190
Epoch: [3] [400/468] eta: 0:00:02 lr: 0.001913545457642601 img/s: 3228.949118290952 loss: 0.8094 (0.6080) acc1: 86.7188 (95.6223) acc5: 99.2188 (99.8208) time: 0.0525 data: 0.0031 max mem: 190
Epoch: [3] [410/468] eta: 0:00:02 lr: 0.001913545457642601 img/s: 3199.450015196572 loss: 0.7277 (0.6102) acc1: 89.0625 (95.5254) acc5: 100.0000 (99.8213) time: 0.0491 data: 0.0029 max mem: 190
Epoch: [3] [420/468] eta: 0:00:01 lr: 0.001913545457642601 img/s: 3448.9086237023334 loss: 0.6970 (0.6125) acc1: 91.4062 (95.4313) acc5: 100.0000 (99.8126) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [3] [430/468] eta: 0:00:01 lr: 0.001913545457642601 img/s: 3289.6703533722634 loss: 0.6822 (0.6141) acc1: 92.1875 (95.3705) acc5: 100.0000 (99.8133) time: 0.0375 data: 0.0002 max mem: 190
Epoch: [3] [440/468] eta: 0:00:01 lr: 0.001913545457642601 img/s: 3540.1505552185267 loss: 0.6592 (0.6149) acc1: 93.7500 (95.3426) acc5: 100.0000 (99.8104) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [3] [450/468] eta: 0:00:00 lr: 0.001913545457642601 img/s: 3581.431529512221 loss: 0.6415 (0.6156) acc1: 93.7500 (95.3090) acc5: 100.0000 (99.8095) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [3] [460/468] eta: 0:00:00 lr: 0.001913545457642601 img/s: 3241.6413290906125 loss: 0.6232 (0.6155) acc1: 95.3125 (95.3125) acc5: 100.0000 (99.8119) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [3] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5786 (0.5786) acc1: 97.6562 (97.6562) acc5: 100.0000 (100.0000) time: 0.1213 data: 0.0950 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 95.390 Acc@5 99.820
Epoch: [4] [ 0/468] eta: 0:01:13 lr: 0.0018090169943749475 img/s: 2566.57445811701 loss: 0.6653 (0.6653) acc1: 92.1875 (92.1875) acc5: 100.0000 (100.0000) time: 0.1563 data: 0.1064 max mem: 190
Epoch: [4] [ 10/468] eta: 0:00:24 lr: 0.0018090169943749475 img/s: 3494.7982814737666 loss: 0.6109 (0.6174) acc1: 96.0938 (95.3125) acc5: 100.0000 (99.8580) time: 0.0527 data: 0.0101 max mem: 190
Epoch: [4] [ 20/468] eta: 0:00:21 lr: 0.0018090169943749475 img/s: 3517.005646904684 loss: 0.6020 (0.6088) acc1: 96.0938 (95.7217) acc5: 100.0000 (99.9256) time: 0.0418 data: 0.0006 max mem: 190
Epoch: [4] [ 30/468] eta: 0:00:19 lr: 0.0018090169943749475 img/s: 3463.392825117893 loss: 0.5943 (0.6025) acc1: 96.0938 (95.9173) acc5: 100.0000 (99.9244) time: 0.0390 data: 0.0005 max mem: 190
Epoch: [4] [ 40/468] eta: 0:00:18 lr: 0.0018090169943749475 img/s: 3438.5706453513694 loss: 0.5790 (0.5999) acc1: 96.0938 (95.9794) acc5: 100.0000 (99.9428) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [4] [ 50/468] eta: 0:00:17 lr: 0.0018090169943749475 img/s: 3453.900964365442 loss: 0.5844 (0.5983) acc1: 96.0938 (96.0478) acc5: 100.0000 (99.9081) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [4] [ 60/468] eta: 0:00:16 lr: 0.0018090169943749475 img/s: 3443.908602219514 loss: 0.6074 (0.6035) acc1: 95.3125 (95.7992) acc5: 100.0000 (99.8975) time: 0.0384 data: 0.0004 max mem: 190
Epoch: [4] [ 70/468] eta: 0:00:16 lr: 0.0018090169943749475 img/s: 3476.0850777289297 loss: 0.6247 (0.6063) acc1: 94.5312 (95.7416) acc5: 100.0000 (99.9010) time: 0.0386 data: 0.0004 max mem: 190
Epoch: [4] [ 80/468] eta: 0:00:15 lr: 0.0018090169943749475 img/s: 3564.5721949632502 loss: 0.6247 (0.6091) acc1: 94.5312 (95.6019) acc5: 100.0000 (99.8939) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [4] [ 90/468] eta: 0:00:14 lr: 0.0018090169943749475 img/s: 3499.81037809648 loss: 0.6081 (0.6080) acc1: 95.3125 (95.6731) acc5: 100.0000 (99.8884) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [4] [100/468] eta: 0:00:14 lr: 0.0018090169943749475 img/s: 3433.578572387902 loss: 0.6030 (0.6081) acc1: 95.3125 (95.6142) acc5: 100.0000 (99.8917) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [4] [110/468] eta: 0:00:14 lr: 0.0018090169943749475 img/s: 3602.9912151778103 loss: 0.6003 (0.6076) acc1: 94.5312 (95.6011) acc5: 100.0000 (99.8944) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [4] [120/468] eta: 0:00:13 lr: 0.0018090169943749475 img/s: 3575.6114766763462 loss: 0.5878 (0.6064) acc1: 96.0938 (95.6353) acc5: 100.0000 (99.9032) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [4] [130/468] eta: 0:00:13 lr: 0.0018090169943749475 img/s: 3571.3301048374224 loss: 0.5812 (0.6045) acc1: 96.8750 (95.7180) acc5: 100.0000 (99.9046) time: 0.0364 data: 0.0002 max mem: 190
Epoch: [4] [140/468] eta: 0:00:12 lr: 0.0018090169943749475 img/s: 3436.017817828068 loss: 0.5775 (0.6025) acc1: 96.8750 (95.7890) acc5: 100.0000 (99.9058) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [4] [150/468] eta: 0:00:12 lr: 0.0018090169943749475 img/s: 3567.675283422602 loss: 0.5696 (0.6018) acc1: 96.8750 (95.8506) acc5: 100.0000 (99.9069) time: 0.0385 data: 0.0003 max mem: 190
Epoch: [4] [160/468] eta: 0:00:11 lr: 0.0018090169943749475 img/s: 3576.2307456601966 loss: 0.5856 (0.6003) acc1: 96.8750 (95.9045) acc5: 100.0000 (99.9078) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [4] [170/468] eta: 0:00:11 lr: 0.0018090169943749475 img/s: 2341.4799443492216 loss: 0.5834 (0.5991) acc1: 96.8750 (95.9430) acc5: 100.0000 (99.9086) time: 0.0456 data: 0.0003 max mem: 190
Epoch: [4] [180/468] eta: 0:00:11 lr: 0.0018090169943749475 img/s: 3425.844300372658 loss: 0.5871 (0.5986) acc1: 96.0938 (95.9384) acc5: 100.0000 (99.9007) time: 0.0495 data: 0.0010 max mem: 190
Epoch: [4] [190/468] eta: 0:00:11 lr: 0.0018090169943749475 img/s: 3642.025045790652 loss: 0.5852 (0.5970) acc1: 96.0938 (96.0079) acc5: 100.0000 (99.9059) time: 0.0403 data: 0.0009 max mem: 190
Epoch: [4] [200/468] eta: 0:00:10 lr: 0.0018090169943749475 img/s: 3513.277177185038 loss: 0.5691 (0.5967) acc1: 96.8750 (96.0199) acc5: 100.0000 (99.9028) time: 0.0365 data: 0.0003 max mem: 190
Epoch: [4] [210/468] eta: 0:00:10 lr: 0.0018090169943749475 img/s: 3409.310302783987 loss: 0.5779 (0.5966) acc1: 96.0938 (96.0086) acc5: 100.0000 (99.9000) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [4] [220/468] eta: 0:00:09 lr: 0.0018090169943749475 img/s: 2335.2163617541387 loss: 0.5842 (0.5966) acc1: 96.0938 (96.0230) acc5: 100.0000 (99.9010) time: 0.0430 data: 0.0008 max mem: 190
Epoch: [4] [230/468] eta: 0:00:09 lr: 0.0018090169943749475 img/s: 2321.1723311989554 loss: 0.5957 (0.5973) acc1: 96.0938 (96.0058) acc5: 100.0000 (99.8985) time: 0.0547 data: 0.0040 max mem: 190
Epoch: [4] [240/468] eta: 0:00:09 lr: 0.0018090169943749475 img/s: 3229.8426923030647 loss: 0.5963 (0.5970) acc1: 96.0938 (96.0095) acc5: 100.0000 (99.8930) time: 0.0570 data: 0.0054 max mem: 190
Epoch: [4] [250/468] eta: 0:00:08 lr: 0.0018090169943749475 img/s: 3209.798589023078 loss: 0.5701 (0.5961) acc1: 96.8750 (96.0471) acc5: 100.0000 (99.8911) time: 0.0471 data: 0.0021 max mem: 190
Epoch: [4] [260/468] eta: 0:00:08 lr: 0.0018090169943749475 img/s: 3546.6755101636354 loss: 0.5701 (0.6003) acc1: 96.8750 (95.8752) acc5: 100.0000 (99.8743) time: 0.0393 data: 0.0003 max mem: 190
Epoch: [4] [270/468] eta: 0:00:08 lr: 0.0018090169943749475 img/s: 3492.1840309623703 loss: 0.6600 (0.6041) acc1: 92.9688 (95.7074) acc5: 100.0000 (99.8645) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [4] [280/468] eta: 0:00:07 lr: 0.0018090169943749475 img/s: 3504.7453519949863 loss: 0.6688 (0.6056) acc1: 92.1875 (95.6350) acc5: 100.0000 (99.8638) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [4] [290/468] eta: 0:00:07 lr: 0.0018090169943749475 img/s: 3551.9786168432056 loss: 0.6537 (0.6070) acc1: 93.7500 (95.5944) acc5: 100.0000 (99.8523) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [4] [300/468] eta: 0:00:06 lr: 0.0018090169943749475 img/s: 3557.674775521023 loss: 0.6653 (0.6094) acc1: 93.7500 (95.5046) acc5: 99.2188 (99.8391) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [4] [310/468] eta: 0:00:06 lr: 0.0018090169943749475 img/s: 3551.7906255168537 loss: 0.6758 (0.6117) acc1: 92.9688 (95.4130) acc5: 100.0000 (99.8342) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [4] [320/468] eta: 0:00:05 lr: 0.0018090169943749475 img/s: 3595.39057874928 loss: 0.6588 (0.6142) acc1: 92.9688 (95.3149) acc5: 100.0000 (99.8248) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [4] [330/468] eta: 0:00:05 lr: 0.0018090169943749475 img/s: 3503.7846840614516 loss: 0.6509 (0.6150) acc1: 93.7500 (95.2960) acc5: 100.0000 (99.8230) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [4] [340/468] eta: 0:00:05 lr: 0.0018090169943749475 img/s: 3458.952349045177 loss: 0.6186 (0.6149) acc1: 95.3125 (95.3033) acc5: 100.0000 (99.8236) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [4] [350/468] eta: 0:00:04 lr: 0.0018090169943749475 img/s: 3440.5318533993836 loss: 0.5989 (0.6143) acc1: 96.0938 (95.3281) acc5: 100.0000 (99.8264) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [4] [360/468] eta: 0:00:04 lr: 0.0018090169943749475 img/s: 3385.6815685087436 loss: 0.5974 (0.6144) acc1: 96.0938 (95.3276) acc5: 100.0000 (99.8247) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [4] [370/468] eta: 0:00:03 lr: 0.0018090169943749475 img/s: 3553.3421493292035 loss: 0.5979 (0.6139) acc1: 96.0938 (95.3651) acc5: 100.0000 (99.8273) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [4] [380/468] eta: 0:00:03 lr: 0.0018090169943749475 img/s: 3550.9214244139903 loss: 0.5998 (0.6138) acc1: 96.0938 (95.3494) acc5: 100.0000 (99.8319) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [4] [390/468] eta: 0:00:03 lr: 0.0018090169943749475 img/s: 3523.838638959266 loss: 0.6247 (0.6144) acc1: 94.5312 (95.3165) acc5: 100.0000 (99.8342) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [4] [400/468] eta: 0:00:02 lr: 0.0018090169943749475 img/s: 3421.870256351422 loss: 0.6127 (0.6145) acc1: 94.5312 (95.3281) acc5: 100.0000 (99.8266) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [4] [410/468] eta: 0:00:02 lr: 0.0018090169943749475 img/s: 3300.7132484491526 loss: 0.6100 (0.6147) acc1: 94.5312 (95.3125) acc5: 100.0000 (99.8251) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [4] [420/468] eta: 0:00:01 lr: 0.0018090169943749475 img/s: 3615.414067813731 loss: 0.6168 (0.6146) acc1: 95.3125 (95.3106) acc5: 100.0000 (99.8200) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [4] [430/468] eta: 0:00:01 lr: 0.0018090169943749475 img/s: 3585.21036955912 loss: 0.5991 (0.6142) acc1: 95.3125 (95.3252) acc5: 100.0000 (99.8187) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [4] [440/468] eta: 0:00:01 lr: 0.0018090169943749475 img/s: 3062.4962893243205 loss: 0.5911 (0.6138) acc1: 95.3125 (95.3444) acc5: 100.0000 (99.8211) time: 0.0415 data: 0.0007 max mem: 190
Epoch: [4] [450/468] eta: 0:00:00 lr: 0.0018090169943749475 img/s: 2291.2944530513128 loss: 0.5909 (0.6136) acc1: 95.3125 (95.3419) acc5: 100.0000 (99.8233) time: 0.0487 data: 0.0029 max mem: 190
Epoch: [4] [460/468] eta: 0:00:00 lr: 0.0018090169943749475 img/s: 3341.2845069020027 loss: 0.5903 (0.6131) acc1: 95.3125 (95.3498) acc5: 100.0000 (99.8271) time: 0.0447 data: 0.0024 max mem: 190
Epoch: [4] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.6161 (0.6161) acc1: 95.3125 (95.3125) acc5: 100.0000 (100.0000) time: 0.1243 data: 0.0979 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 95.730 Acc@5 99.930
Epoch: [5] [ 0/468] eta: 0:01:14 lr: 0.0016691306063588583 img/s: 2564.9313560617643 loss: 0.6228 (0.6228) acc1: 96.0938 (96.0938) acc5: 99.2188 (99.2188) time: 0.1600 data: 0.1101 max mem: 190
Epoch: [5] [ 10/468] eta: 0:00:24 lr: 0.0016691306063588583 img/s: 3533.300725257657 loss: 0.6159 (0.6111) acc1: 96.0938 (95.5966) acc5: 100.0000 (99.7869) time: 0.0539 data: 0.0111 max mem: 190
Epoch: [5] [ 20/468] eta: 0:00:21 lr: 0.0016691306063588583 img/s: 3536.978628086542 loss: 0.5970 (0.5997) acc1: 96.0938 (95.9077) acc5: 100.0000 (99.8140) time: 0.0420 data: 0.0008 max mem: 190
Epoch: [5] [ 30/468] eta: 0:00:19 lr: 0.0016691306063588583 img/s: 3382.5245370749562 loss: 0.5810 (0.6025) acc1: 96.0938 (95.8417) acc5: 100.0000 (99.7984) time: 0.0390 data: 0.0004 max mem: 190
Epoch: [5] [ 40/468] eta: 0:00:18 lr: 0.0016691306063588583 img/s: 3408.271406805485 loss: 0.6031 (0.6021) acc1: 95.3125 (95.8460) acc5: 100.0000 (99.8285) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [5] [ 50/468] eta: 0:00:17 lr: 0.0016691306063588583 img/s: 3385.361330760597 loss: 0.5901 (0.6014) acc1: 96.0938 (95.8946) acc5: 100.0000 (99.8468) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [5] [ 60/468] eta: 0:00:16 lr: 0.0016691306063588583 img/s: 3415.5135444632474 loss: 0.5825 (0.5997) acc1: 96.8750 (96.0041) acc5: 100.0000 (99.8719) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [5] [ 70/468] eta: 0:00:16 lr: 0.0016691306063588583 img/s: 3336.840315242523 loss: 0.5662 (0.5967) acc1: 97.6562 (96.0827) acc5: 100.0000 (99.8790) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [5] [ 80/468] eta: 0:00:15 lr: 0.0016691306063588583 img/s: 3558.1935141798613 loss: 0.5666 (0.5931) acc1: 97.6562 (96.2288) acc5: 100.0000 (99.8939) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [5] [ 90/468] eta: 0:00:15 lr: 0.0016691306063588583 img/s: 3518.388570679599 loss: 0.5748 (0.5911) acc1: 96.8750 (96.2826) acc5: 100.0000 (99.9056) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [5] [100/468] eta: 0:00:14 lr: 0.0016691306063588583 img/s: 3507.9972295186944 loss: 0.5749 (0.5911) acc1: 96.0938 (96.2871) acc5: 100.0000 (99.9149) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [5] [110/468] eta: 0:00:14 lr: 0.0016691306063588583 img/s: 3445.898023106547 loss: 0.5652 (0.5887) acc1: 96.8750 (96.3612) acc5: 100.0000 (99.9155) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [5] [120/468] eta: 0:00:13 lr: 0.0016691306063588583 img/s: 3503.6703539101095 loss: 0.5570 (0.5868) acc1: 98.4375 (96.4941) acc5: 100.0000 (99.9161) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [5] [130/468] eta: 0:00:13 lr: 0.0016691306063588583 img/s: 3515.8770653376905 loss: 0.5673 (0.5860) acc1: 97.6562 (96.5649) acc5: 100.0000 (99.9105) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [5] [140/468] eta: 0:00:12 lr: 0.0016691306063588583 img/s: 3546.5115074646583 loss: 0.5682 (0.5845) acc1: 97.6562 (96.6312) acc5: 100.0000 (99.9169) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [5] [150/468] eta: 0:00:12 lr: 0.0016691306063588583 img/s: 3463.392825117893 loss: 0.5682 (0.5837) acc1: 96.8750 (96.6474) acc5: 100.0000 (99.9224) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [5] [160/468] eta: 0:00:11 lr: 0.0016691306063588583 img/s: 3087.843005532997 loss: 0.5680 (0.5827) acc1: 96.8750 (96.7052) acc5: 100.0000 (99.9175) time: 0.0391 data: 0.0003 max mem: 190
Epoch: [5] [170/468] eta: 0:00:11 lr: 0.0016691306063588583 img/s: 3348.808373410181 loss: 0.5629 (0.5817) acc1: 97.6562 (96.7608) acc5: 100.0000 (99.9132) time: 0.0408 data: 0.0003 max mem: 190
Epoch: [5] [180/468] eta: 0:00:11 lr: 0.0016691306063588583 img/s: 3404.726617792547 loss: 0.5687 (0.5814) acc1: 97.6562 (96.7800) acc5: 100.0000 (99.9050) time: 0.0401 data: 0.0003 max mem: 190
Epoch: [5] [190/468] eta: 0:00:10 lr: 0.0016691306063588583 img/s: 3387.732525634958 loss: 0.5621 (0.5802) acc1: 97.6562 (96.8136) acc5: 100.0000 (99.9018) time: 0.0392 data: 0.0003 max mem: 190
Epoch: [5] [200/468] eta: 0:00:10 lr: 0.0016691306063588583 img/s: 3349.100840283713 loss: 0.5525 (0.5793) acc1: 96.8750 (96.8595) acc5: 100.0000 (99.8989) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [5] [210/468] eta: 0:00:10 lr: 0.0016691306063588583 img/s: 3505.157228105454 loss: 0.5715 (0.5793) acc1: 96.8750 (96.8269) acc5: 100.0000 (99.9037) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [5] [220/468] eta: 0:00:09 lr: 0.0016691306063588583 img/s: 2838.7243924620884 loss: 0.5872 (0.5800) acc1: 96.8750 (96.8149) acc5: 100.0000 (99.9081) time: 0.0469 data: 0.0017 max mem: 190
Epoch: [5] [230/468] eta: 0:00:09 lr: 0.0016691306063588583 img/s: 3548.808926375907 loss: 0.5927 (0.5808) acc1: 96.0938 (96.7803) acc5: 100.0000 (99.9087) time: 0.0489 data: 0.0030 max mem: 190
Epoch: [5] [240/468] eta: 0:00:09 lr: 0.0016691306063588583 img/s: 3558.052024998509 loss: 0.5956 (0.5810) acc1: 96.0938 (96.7680) acc5: 100.0000 (99.9060) time: 0.0394 data: 0.0015 max mem: 190
Epoch: [5] [250/468] eta: 0:00:08 lr: 0.0016691306063588583 img/s: 3512.0361100571745 loss: 0.5814 (0.5812) acc1: 96.8750 (96.7661) acc5: 100.0000 (99.9035) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [5] [260/468] eta: 0:00:08 lr: 0.0016691306063588583 img/s: 3292.7977233139522 loss: 0.5690 (0.5806) acc1: 96.8750 (96.7762) acc5: 100.0000 (99.9072) time: 0.0388 data: 0.0003 max mem: 190
Epoch: [5] [270/468] eta: 0:00:07 lr: 0.0016691306063588583 img/s: 3578.6384039567793 loss: 0.5666 (0.5808) acc1: 96.8750 (96.7626) acc5: 100.0000 (99.9106) time: 0.0388 data: 0.0003 max mem: 190
Epoch: [5] [280/468] eta: 0:00:07 lr: 0.0016691306063588583 img/s: 3536.978628086542 loss: 0.5828 (0.5810) acc1: 96.0938 (96.7471) acc5: 100.0000 (99.9083) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [5] [290/468] eta: 0:00:06 lr: 0.0016691306063588583 img/s: 3547.8473992717563 loss: 0.5722 (0.5804) acc1: 96.8750 (96.7730) acc5: 100.0000 (99.9114) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [5] [300/468] eta: 0:00:06 lr: 0.0016691306063588583 img/s: 3528.3778177946606 loss: 0.5726 (0.5804) acc1: 97.6562 (96.7712) acc5: 100.0000 (99.9143) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [5] [310/468] eta: 0:00:06 lr: 0.0016691306063588583 img/s: 3517.6047803753013 loss: 0.5726 (0.5806) acc1: 96.8750 (96.7620) acc5: 100.0000 (99.9146) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [5] [320/468] eta: 0:00:05 lr: 0.0016691306063588583 img/s: 3548.433634284657 loss: 0.5642 (0.5801) acc1: 96.8750 (96.7825) acc5: 100.0000 (99.9173) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [5] [330/468] eta: 0:00:05 lr: 0.0016691306063588583 img/s: 3526.8480134538577 loss: 0.5545 (0.5798) acc1: 96.8750 (96.7782) acc5: 100.0000 (99.9198) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [5] [340/468] eta: 0:00:04 lr: 0.0016691306063588583 img/s: 3395.2740082087994 loss: 0.5802 (0.5803) acc1: 96.0938 (96.7536) acc5: 100.0000 (99.9198) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [5] [350/468] eta: 0:00:04 lr: 0.0016691306063588583 img/s: 3578.161382555435 loss: 0.5988 (0.5803) acc1: 96.0938 (96.7548) acc5: 100.0000 (99.9176) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [5] [360/468] eta: 0:00:04 lr: 0.0016691306063588583 img/s: 3519.4264774328885 loss: 0.5805 (0.5803) acc1: 96.8750 (96.7430) acc5: 100.0000 (99.9178) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [5] [370/468] eta: 0:00:03 lr: 0.0016691306063588583 img/s: 3484.5908483157004 loss: 0.5814 (0.5806) acc1: 96.8750 (96.7297) acc5: 100.0000 (99.9179) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [5] [380/468] eta: 0:00:03 lr: 0.0016691306063588583 img/s: 3446.738691079981 loss: 0.5775 (0.5806) acc1: 96.8750 (96.7315) acc5: 100.0000 (99.9180) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [5] [390/468] eta: 0:00:03 lr: 0.0016691306063588583 img/s: 3408.812419441887 loss: 0.5775 (0.5808) acc1: 96.8750 (96.7291) acc5: 100.0000 (99.9161) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [5] [400/468] eta: 0:00:02 lr: 0.0016691306063588583 img/s: 3489.619052571369 loss: 0.5905 (0.5811) acc1: 96.8750 (96.7191) acc5: 100.0000 (99.9104) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [5] [410/468] eta: 0:00:02 lr: 0.0016691306063588583 img/s: 3493.979486645494 loss: 0.5783 (0.5811) acc1: 96.8750 (96.7115) acc5: 100.0000 (99.9088) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [5] [420/468] eta: 0:00:01 lr: 0.0016691306063588583 img/s: 3499.240097767639 loss: 0.5662 (0.5808) acc1: 96.8750 (96.7191) acc5: 100.0000 (99.9109) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [5] [430/468] eta: 0:00:01 lr: 0.0016691306063588583 img/s: 3484.4777673211097 loss: 0.5600 (0.5802) acc1: 96.8750 (96.7391) acc5: 100.0000 (99.9112) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [5] [440/468] eta: 0:00:01 lr: 0.0016691306063588583 img/s: 3520.742038717801 loss: 0.5597 (0.5799) acc1: 97.6562 (96.7528) acc5: 100.0000 (99.9114) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [5] [450/468] eta: 0:00:00 lr: 0.0016691306063588583 img/s: 3472.4877398824115 loss: 0.5598 (0.5798) acc1: 96.8750 (96.7503) acc5: 100.0000 (99.9134) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [5] [460/468] eta: 0:00:00 lr: 0.0016691306063588583 img/s: 3504.310698876654 loss: 0.5630 (0.5794) acc1: 97.6562 (96.7615) acc5: 100.0000 (99.9153) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [5] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5471 (0.5471) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1204 data: 0.1010 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 97.440 Acc@5 99.980
Epoch: [6] [ 0/468] eta: 0:01:42 lr: 0.0015 img/s: 1579.9847319707471 loss: 0.6669 (0.6669) acc1: 93.7500 (93.7500) acc5: 100.0000 (100.0000) time: 0.2198 data: 0.1388 max mem: 190
Epoch: [6] [ 10/468] eta: 0:00:37 lr: 0.0015 img/s: 3150.1517482558515 loss: 0.5509 (0.5569) acc1: 98.4375 (97.9403) acc5: 100.0000 (100.0000) time: 0.0820 data: 0.0176 max mem: 190
Epoch: [6] [ 20/468] eta: 0:00:32 lr: 0.0015 img/s: 2385.509817600142 loss: 0.5436 (0.5496) acc1: 98.4375 (98.1027) acc5: 100.0000 (99.9256) time: 0.0648 data: 0.0048 max mem: 190
Epoch: [6] [ 30/468] eta: 0:00:28 lr: 0.0015 img/s: 3463.6386111146953 loss: 0.5402 (0.5503) acc1: 97.6562 (98.0091) acc5: 100.0000 (99.9496) time: 0.0559 data: 0.0043 max mem: 190
Epoch: [6] [ 40/468] eta: 0:00:24 lr: 0.0015 img/s: 3532.8589609449546 loss: 0.5531 (0.5517) acc1: 97.6562 (97.8849) acc5: 100.0000 (99.9428) time: 0.0438 data: 0.0024 max mem: 190
Epoch: [6] [ 50/468] eta: 0:00:22 lr: 0.0015 img/s: 3459.219793814433 loss: 0.5620 (0.5530) acc1: 97.6562 (97.8248) acc5: 100.0000 (99.9540) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [6] [ 60/468] eta: 0:00:20 lr: 0.0015 img/s: 3523.537983946655 loss: 0.5487 (0.5516) acc1: 97.6562 (97.8996) acc5: 100.0000 (99.9616) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [6] [ 70/468] eta: 0:00:19 lr: 0.0015 img/s: 3478.6978118459674 loss: 0.5407 (0.5519) acc1: 98.4375 (97.9093) acc5: 100.0000 (99.9670) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [6] [ 80/468] eta: 0:00:18 lr: 0.0015 img/s: 3461.6509791025915 loss: 0.5368 (0.5515) acc1: 97.6562 (97.9649) acc5: 100.0000 (99.9614) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [6] [ 90/468] eta: 0:00:17 lr: 0.0015 img/s: 3463.593920156899 loss: 0.5368 (0.5517) acc1: 98.4375 (97.9653) acc5: 100.0000 (99.9657) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [6] [100/468] eta: 0:00:16 lr: 0.0015 img/s: 3570.3563367449406 loss: 0.5376 (0.5519) acc1: 97.6562 (97.9502) acc5: 100.0000 (99.9691) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [6] [110/468] eta: 0:00:16 lr: 0.0015 img/s: 3560.765861488055 loss: 0.5520 (0.5527) acc1: 97.6562 (97.8885) acc5: 100.0000 (99.9578) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [6] [120/468] eta: 0:00:15 lr: 0.0015 img/s: 2785.958465226822 loss: 0.5322 (0.5515) acc1: 97.6562 (97.9468) acc5: 100.0000 (99.9548) time: 0.0393 data: 0.0003 max mem: 190
Epoch: [6] [130/468] eta: 0:00:15 lr: 0.0015 img/s: 2802.054875025444 loss: 0.5300 (0.5520) acc1: 97.6562 (97.8948) acc5: 100.0000 (99.9523) time: 0.0415 data: 0.0003 max mem: 190
Epoch: [6] [140/468] eta: 0:00:14 lr: 0.0015 img/s: 3535.5810547389497 loss: 0.5530 (0.5526) acc1: 97.6562 (97.8723) acc5: 100.0000 (99.9501) time: 0.0394 data: 0.0003 max mem: 190
Epoch: [6] [150/468] eta: 0:00:13 lr: 0.0015 img/s: 3535.7207623714125 loss: 0.5523 (0.5529) acc1: 97.6562 (97.8477) acc5: 100.0000 (99.9483) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [6] [160/468] eta: 0:00:13 lr: 0.0015 img/s: 3605.6287660008866 loss: 0.5583 (0.5537) acc1: 97.6562 (97.8164) acc5: 100.0000 (99.9515) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [6] [170/468] eta: 0:00:12 lr: 0.0015 img/s: 3558.688814943458 loss: 0.5609 (0.5539) acc1: 97.6562 (97.8207) acc5: 100.0000 (99.9543) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [6] [180/468] eta: 0:00:12 lr: 0.0015 img/s: 3531.743416681468 loss: 0.5429 (0.5535) acc1: 97.6562 (97.8289) acc5: 100.0000 (99.9568) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [6] [190/468] eta: 0:00:11 lr: 0.0015 img/s: 3577.5176053522405 loss: 0.5429 (0.5534) acc1: 97.6562 (97.8321) acc5: 100.0000 (99.9550) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [6] [200/468] eta: 0:00:11 lr: 0.0015 img/s: 3349.2470928781754 loss: 0.5524 (0.5535) acc1: 97.6562 (97.8234) acc5: 100.0000 (99.9572) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [6] [210/468] eta: 0:00:10 lr: 0.0015 img/s: 3514.5421290022714 loss: 0.5593 (0.5540) acc1: 97.6562 (97.8044) acc5: 100.0000 (99.9556) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [6] [220/468] eta: 0:00:10 lr: 0.0015 img/s: 3602.0967767907464 loss: 0.5576 (0.5543) acc1: 97.6562 (97.7835) acc5: 100.0000 (99.9540) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [6] [230/468] eta: 0:00:09 lr: 0.0015 img/s: 3557.816234700031 loss: 0.5456 (0.5544) acc1: 97.6562 (97.7746) acc5: 100.0000 (99.9527) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [6] [240/468] eta: 0:00:09 lr: 0.0015 img/s: 3482.5791033932496 loss: 0.5582 (0.5549) acc1: 96.8750 (97.7470) acc5: 100.0000 (99.9546) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [6] [250/468] eta: 0:00:08 lr: 0.0015 img/s: 3449.9502753555203 loss: 0.5683 (0.5558) acc1: 96.8750 (97.7185) acc5: 100.0000 (99.9533) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [6] [260/468] eta: 0:00:08 lr: 0.0015 img/s: 3443.1134769057117 loss: 0.5683 (0.5562) acc1: 96.8750 (97.7011) acc5: 100.0000 (99.9551) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [6] [270/468] eta: 0:00:08 lr: 0.0015 img/s: 3272.545531017415 loss: 0.5686 (0.5569) acc1: 96.8750 (97.6764) acc5: 100.0000 (99.9539) time: 0.0466 data: 0.0031 max mem: 190
Epoch: [6] [280/468] eta: 0:00:07 lr: 0.0015 img/s: 3548.8793024808465 loss: 0.5686 (0.5571) acc1: 96.8750 (97.6646) acc5: 100.0000 (99.9555) time: 0.0494 data: 0.0035 max mem: 190
Epoch: [6] [290/468] eta: 0:00:07 lr: 0.0015 img/s: 3544.848908227744 loss: 0.5674 (0.5572) acc1: 97.6562 (97.6616) acc5: 100.0000 (99.9570) time: 0.0401 data: 0.0007 max mem: 190
Epoch: [6] [300/468] eta: 0:00:06 lr: 0.0015 img/s: 3525.4352825294677 loss: 0.5674 (0.5573) acc1: 97.6562 (97.6640) acc5: 100.0000 (99.9533) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [6] [310/468] eta: 0:00:06 lr: 0.0015 img/s: 3477.2558178697495 loss: 0.5506 (0.5572) acc1: 97.6562 (97.6613) acc5: 100.0000 (99.9548) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [6] [320/468] eta: 0:00:06 lr: 0.0015 img/s: 3415.2310892562928 loss: 0.5397 (0.5569) acc1: 97.6562 (97.6830) acc5: 100.0000 (99.9538) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [6] [330/468] eta: 0:00:05 lr: 0.0015 img/s: 3478.066792347709 loss: 0.5433 (0.5572) acc1: 97.6562 (97.6657) acc5: 100.0000 (99.9528) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [6] [340/468] eta: 0:00:05 lr: 0.0015 img/s: 2952.0947976751477 loss: 0.5600 (0.5575) acc1: 96.8750 (97.6631) acc5: 100.0000 (99.9496) time: 0.0409 data: 0.0013 max mem: 190
Epoch: [6] [350/468] eta: 0:00:04 lr: 0.0015 img/s: 2809.577425871074 loss: 0.5623 (0.5582) acc1: 96.8750 (97.6295) acc5: 100.0000 (99.9510) time: 0.0456 data: 0.0019 max mem: 190
Epoch: [6] [360/468] eta: 0:00:04 lr: 0.0015 img/s: 3442.8264385432767 loss: 0.5646 (0.5587) acc1: 97.6562 (97.6151) acc5: 100.0000 (99.9459) time: 0.0457 data: 0.0021 max mem: 190
Epoch: [6] [370/468] eta: 0:00:04 lr: 0.0015 img/s: 3640.8888888888887 loss: 0.5475 (0.5584) acc1: 97.6562 (97.6183) acc5: 100.0000 (99.9474) time: 0.0406 data: 0.0015 max mem: 190
Epoch: [6] [380/468] eta: 0:00:03 lr: 0.0015 img/s: 3591.614286956696 loss: 0.5605 (0.5589) acc1: 96.8750 (97.5968) acc5: 100.0000 (99.9467) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [6] [390/468] eta: 0:00:03 lr: 0.0015 img/s: 3500.0157245209953 loss: 0.5546 (0.5586) acc1: 97.6562 (97.6143) acc5: 100.0000 (99.9480) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [6] [400/468] eta: 0:00:02 lr: 0.0015 img/s: 3382.9508188457394 loss: 0.5446 (0.5587) acc1: 97.6562 (97.6114) acc5: 100.0000 (99.9493) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [6] [410/468] eta: 0:00:02 lr: 0.0015 img/s: 3362.062260074522 loss: 0.5734 (0.5591) acc1: 96.8750 (97.5840) acc5: 100.0000 (99.9487) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [6] [420/468] eta: 0:00:01 lr: 0.0015 img/s: 3414.861795237126 loss: 0.5567 (0.5587) acc1: 96.8750 (97.5987) acc5: 100.0000 (99.9499) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [6] [430/468] eta: 0:00:01 lr: 0.0015 img/s: 3449.7507614408905 loss: 0.5381 (0.5586) acc1: 98.4375 (97.5964) acc5: 100.0000 (99.9511) time: 0.0389 data: 0.0003 max mem: 190
Epoch: [6] [440/468] eta: 0:00:01 lr: 0.0015 img/s: 3363.6210035649174 loss: 0.5545 (0.5587) acc1: 97.6562 (97.5942) acc5: 100.0000 (99.9504) time: 0.0389 data: 0.0003 max mem: 190
Epoch: [6] [450/468] eta: 0:00:00 lr: 0.0015 img/s: 3469.7497689508755 loss: 0.5516 (0.5587) acc1: 97.6562 (97.5956) acc5: 100.0000 (99.9498) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [6] [460/468] eta: 0:00:00 lr: 0.0015 img/s: 3430.682352339751 loss: 0.5530 (0.5589) acc1: 97.6562 (97.5851) acc5: 100.0000 (99.9492) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [6] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5535 (0.5535) acc1: 97.6562 (97.6562) acc5: 100.0000 (100.0000) time: 0.1226 data: 0.0983 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 97.780 Acc@5 99.980
Epoch: [7] [ 0/468] eta: 0:01:16 lr: 0.0013090169943749475 img/s: 2489.0627005174047 loss: 0.5265 (0.5265) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1624 data: 0.1110 max mem: 190
Epoch: [7] [ 10/468] eta: 0:00:25 lr: 0.0013090169943749475 img/s: 3451.502838370396 loss: 0.5363 (0.5397) acc1: 97.6562 (98.2955) acc5: 100.0000 (100.0000) time: 0.0547 data: 0.0117 max mem: 190
Epoch: [7] [ 20/468] eta: 0:00:21 lr: 0.0013090169943749475 img/s: 3377.842518198806 loss: 0.5363 (0.5402) acc1: 98.4375 (98.2887) acc5: 100.0000 (100.0000) time: 0.0429 data: 0.0014 max mem: 190
Epoch: [7] [ 30/468] eta: 0:00:20 lr: 0.0013090169943749475 img/s: 1929.3512011931073 loss: 0.5474 (0.5520) acc1: 97.6562 (97.7571) acc5: 100.0000 (99.9496) time: 0.0422 data: 0.0007 max mem: 190
Epoch: [7] [ 40/468] eta: 0:00:20 lr: 0.0013090169943749475 img/s: 3102.421348866506 loss: 0.5683 (0.5610) acc1: 96.0938 (97.3133) acc5: 100.0000 (99.9428) time: 0.0472 data: 0.0019 max mem: 190
Epoch: [7] [ 50/468] eta: 0:00:19 lr: 0.0013090169943749475 img/s: 3340.3282148279036 loss: 0.5635 (0.5618) acc1: 96.0938 (97.3039) acc5: 100.0000 (99.9540) time: 0.0484 data: 0.0025 max mem: 190
Epoch: [7] [ 60/468] eta: 0:00:18 lr: 0.0013090169943749475 img/s: 3524.926050673968 loss: 0.5575 (0.5615) acc1: 97.6562 (97.3361) acc5: 100.0000 (99.9616) time: 0.0415 data: 0.0009 max mem: 190
Epoch: [7] [ 70/468] eta: 0:00:17 lr: 0.0013090169943749475 img/s: 3519.4264774328885 loss: 0.5377 (0.5585) acc1: 98.4375 (97.5022) acc5: 100.0000 (99.9670) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [7] [ 80/468] eta: 0:00:16 lr: 0.0013090169943749475 img/s: 3478.787976180448 loss: 0.5332 (0.5582) acc1: 98.4375 (97.5598) acc5: 100.0000 (99.9711) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [7] [ 90/468] eta: 0:00:16 lr: 0.0013090169943749475 img/s: 3446.9821188949027 loss: 0.5633 (0.5601) acc1: 97.6562 (97.4931) acc5: 100.0000 (99.9657) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [7] [100/468] eta: 0:00:15 lr: 0.0013090169943749475 img/s: 3419.908474749019 loss: 0.5747 (0.5619) acc1: 97.6562 (97.4629) acc5: 100.0000 (99.9536) time: 0.0377 data: 0.0002 max mem: 190
Epoch: [7] [110/468] eta: 0:00:15 lr: 0.0013090169943749475 img/s: 3515.669853576761 loss: 0.5706 (0.5618) acc1: 97.6562 (97.4944) acc5: 100.0000 (99.9507) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [7] [120/468] eta: 0:00:14 lr: 0.0013090169943749475 img/s: 3549.489345665871 loss: 0.5600 (0.5613) acc1: 97.6562 (97.5465) acc5: 100.0000 (99.9548) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [7] [130/468] eta: 0:00:14 lr: 0.0013090169943749475 img/s: 2857.048879522753 loss: 0.5624 (0.5615) acc1: 97.6562 (97.5310) acc5: 100.0000 (99.9404) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [7] [140/468] eta: 0:00:13 lr: 0.0013090169943749475 img/s: 3495.7767894932185 loss: 0.5542 (0.5605) acc1: 97.6562 (97.5842) acc5: 100.0000 (99.9446) time: 0.0405 data: 0.0003 max mem: 190
Epoch: [7] [150/468] eta: 0:00:13 lr: 0.0013090169943749475 img/s: 3554.0007811413934 loss: 0.5517 (0.5603) acc1: 98.4375 (97.6407) acc5: 100.0000 (99.9483) time: 0.0396 data: 0.0003 max mem: 190
Epoch: [7] [160/468] eta: 0:00:12 lr: 0.0013090169943749475 img/s: 3441.9214771124502 loss: 0.5565 (0.5605) acc1: 98.4375 (97.6271) acc5: 100.0000 (99.9369) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [7] [170/468] eta: 0:00:12 lr: 0.0013090169943749475 img/s: 3572.7798651733247 loss: 0.5565 (0.5605) acc1: 97.6562 (97.6243) acc5: 100.0000 (99.9269) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [7] [180/468] eta: 0:00:11 lr: 0.0013090169943749475 img/s: 3475.117561007185 loss: 0.5568 (0.5610) acc1: 97.6562 (97.6088) acc5: 100.0000 (99.9309) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [7] [190/468] eta: 0:00:11 lr: 0.0013090169943749475 img/s: 3443.1797233249745 loss: 0.5642 (0.5609) acc1: 97.6562 (97.6194) acc5: 100.0000 (99.9346) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [7] [200/468] eta: 0:00:10 lr: 0.0013090169943749475 img/s: 3543.702389438944 loss: 0.5639 (0.5611) acc1: 97.6562 (97.6096) acc5: 100.0000 (99.9378) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [7] [210/468] eta: 0:00:10 lr: 0.0013090169943749475 img/s: 3572.684762861763 loss: 0.5502 (0.5611) acc1: 97.6562 (97.5970) acc5: 100.0000 (99.9408) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [7] [220/468] eta: 0:00:09 lr: 0.0013090169943749475 img/s: 3566.6087280022853 loss: 0.5558 (0.5613) acc1: 96.8750 (97.5785) acc5: 100.0000 (99.9434) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [7] [230/468] eta: 0:00:09 lr: 0.0013090169943749475 img/s: 3526.6626727626253 loss: 0.5505 (0.5606) acc1: 97.6562 (97.6089) acc5: 100.0000 (99.9459) time: 0.0363 data: 0.0003 max mem: 190
Epoch: [7] [240/468] eta: 0:00:09 lr: 0.0013090169943749475 img/s: 3495.754027269531 loss: 0.5505 (0.5605) acc1: 98.4375 (97.6109) acc5: 100.0000 (99.9449) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [7] [250/468] eta: 0:00:08 lr: 0.0013090169943749475 img/s: 3506.9660519835124 loss: 0.5430 (0.5600) acc1: 97.6562 (97.6251) acc5: 100.0000 (99.9471) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [7] [260/468] eta: 0:00:08 lr: 0.0013090169943749475 img/s: 3502.4589128676184 loss: 0.5449 (0.5598) acc1: 98.4375 (97.6503) acc5: 100.0000 (99.9461) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [7] [270/468] eta: 0:00:07 lr: 0.0013090169943749475 img/s: 3502.9845296585563 loss: 0.5531 (0.5596) acc1: 97.6562 (97.6534) acc5: 100.0000 (99.9452) time: 0.0378 data: 0.0002 max mem: 190
Epoch: [7] [280/468] eta: 0:00:07 lr: 0.0013090169943749475 img/s: 3531.3484970071695 loss: 0.5551 (0.5595) acc1: 97.6562 (97.6590) acc5: 100.0000 (99.9472) time: 0.0379 data: 0.0002 max mem: 190
Epoch: [7] [290/468] eta: 0:00:06 lr: 0.0013090169943749475 img/s: 3533.8589012782877 loss: 0.5554 (0.5593) acc1: 97.6562 (97.6562) acc5: 100.0000 (99.9463) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [7] [300/468] eta: 0:00:06 lr: 0.0013090169943749475 img/s: 3543.655608506818 loss: 0.5470 (0.5591) acc1: 97.6562 (97.6537) acc5: 100.0000 (99.9481) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [7] [310/468] eta: 0:00:06 lr: 0.0013090169943749475 img/s: 1957.5682104910047 loss: 0.5547 (0.5592) acc1: 97.6562 (97.6437) acc5: 100.0000 (99.9498) time: 0.0412 data: 0.0003 max mem: 190
Epoch: [7] [320/468] eta: 0:00:06 lr: 0.0013090169943749475 img/s: 1364.188460815254 loss: 0.5524 (0.5589) acc1: 98.4375 (97.6611) acc5: 100.0000 (99.9489) time: 0.0678 data: 0.0030 max mem: 190
Epoch: [7] [330/468] eta: 0:00:05 lr: 0.0013090169943749475 img/s: 3207.1908050371576 loss: 0.5477 (0.5592) acc1: 97.6562 (97.6586) acc5: 100.0000 (99.9410) time: 0.0775 data: 0.0048 max mem: 190
Epoch: [7] [340/468] eta: 0:00:05 lr: 0.0013090169943749475 img/s: 3364.8437321767688 loss: 0.5512 (0.5589) acc1: 97.6562 (97.6631) acc5: 100.0000 (99.9427) time: 0.0515 data: 0.0021 max mem: 190
Epoch: [7] [350/468] eta: 0:00:04 lr: 0.0013090169943749475 img/s: 3425.0574935565364 loss: 0.5448 (0.5588) acc1: 97.6562 (97.6629) acc5: 100.0000 (99.9444) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [7] [360/468] eta: 0:00:04 lr: 0.0013090169943749475 img/s: 3450.127640432109 loss: 0.5366 (0.5583) acc1: 98.4375 (97.6887) acc5: 100.0000 (99.9459) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [7] [370/468] eta: 0:00:04 lr: 0.0013090169943749475 img/s: 3478.201991538875 loss: 0.5509 (0.5586) acc1: 97.6562 (97.6773) acc5: 100.0000 (99.9410) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [7] [380/468] eta: 0:00:03 lr: 0.0013090169943749475 img/s: 3473.7684373989 loss: 0.5645 (0.5588) acc1: 96.8750 (97.6686) acc5: 100.0000 (99.9426) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [7] [390/468] eta: 0:00:03 lr: 0.0013090169943749475 img/s: 3551.344225858944 loss: 0.5611 (0.5590) acc1: 96.8750 (97.6443) acc5: 100.0000 (99.9441) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [7] [400/468] eta: 0:00:02 lr: 0.0013090169943749475 img/s: 3478.990863023108 loss: 0.5611 (0.5593) acc1: 96.8750 (97.6387) acc5: 100.0000 (99.9454) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [7] [410/468] eta: 0:00:02 lr: 0.0013090169943749475 img/s: 3477.931603666634 loss: 0.5499 (0.5592) acc1: 97.6562 (97.6448) acc5: 100.0000 (99.9468) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [7] [420/468] eta: 0:00:01 lr: 0.0013090169943749475 img/s: 3323.3109373742627 loss: 0.5408 (0.5588) acc1: 98.4375 (97.6581) acc5: 100.0000 (99.9480) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [7] [430/468] eta: 0:00:01 lr: 0.0013090169943749475 img/s: 3464.0632327416556 loss: 0.5350 (0.5586) acc1: 97.6562 (97.6581) acc5: 100.0000 (99.9492) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [7] [440/468] eta: 0:00:01 lr: 0.0013090169943749475 img/s: 3507.7221895540138 loss: 0.5480 (0.5584) acc1: 97.6562 (97.6704) acc5: 100.0000 (99.9504) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [7] [450/468] eta: 0:00:00 lr: 0.0013090169943749475 img/s: 3492.6611239054346 loss: 0.5523 (0.5584) acc1: 97.6562 (97.6649) acc5: 100.0000 (99.9515) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [7] [460/468] eta: 0:00:00 lr: 0.0013090169943749475 img/s: 3556.143021792409 loss: 0.5525 (0.5584) acc1: 97.6562 (97.6613) acc5: 100.0000 (99.9509) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [7] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5532 (0.5532) acc1: 97.6562 (97.6562) acc5: 100.0000 (100.0000) time: 0.1250 data: 0.1056 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 97.530 Acc@5 99.950
Epoch: [8] [ 0/468] eta: 0:01:36 lr: 0.0011045284632676536 img/s: 2262.6431328916537 loss: 0.5521 (0.5521) acc1: 97.6562 (97.6562) acc5: 100.0000 (100.0000) time: 0.2056 data: 0.1490 max mem: 190
Epoch: [8] [ 10/468] eta: 0:00:23 lr: 0.0011045284632676536 img/s: 3470.736735947248 loss: 0.5344 (0.5478) acc1: 98.4375 (97.8693) acc5: 100.0000 (100.0000) time: 0.0523 data: 0.0138 max mem: 190
Epoch: [8] [ 20/468] eta: 0:00:20 lr: 0.0011045284632676536 img/s: 3469.368591110594 loss: 0.5449 (0.5517) acc1: 97.6562 (97.7679) acc5: 100.0000 (99.9256) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [8] [ 30/468] eta: 0:00:18 lr: 0.0011045284632676536 img/s: 3585.66532422342 loss: 0.5611 (0.5518) acc1: 97.6562 (97.7823) acc5: 100.0000 (99.9496) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [8] [ 40/468] eta: 0:00:17 lr: 0.0011045284632676536 img/s: 3444.195821064044 loss: 0.5630 (0.5562) acc1: 96.8750 (97.5800) acc5: 100.0000 (99.9619) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [8] [ 50/468] eta: 0:00:16 lr: 0.0011045284632676536 img/s: 3318.9554337007526 loss: 0.5700 (0.5597) acc1: 96.8750 (97.4877) acc5: 100.0000 (99.9387) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [8] [ 60/468] eta: 0:00:16 lr: 0.0011045284632676536 img/s: 3511.8293507767785 loss: 0.5537 (0.5556) acc1: 97.6562 (97.6562) acc5: 100.0000 (99.9488) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [8] [ 70/468] eta: 0:00:15 lr: 0.0011045284632676536 img/s: 3525.57418948115 loss: 0.5447 (0.5569) acc1: 97.6562 (97.6012) acc5: 100.0000 (99.9450) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [8] [ 80/468] eta: 0:00:15 lr: 0.0011045284632676536 img/s: 2674.5390018631624 loss: 0.5500 (0.5547) acc1: 97.6562 (97.6852) acc5: 100.0000 (99.9518) time: 0.0420 data: 0.0008 max mem: 190
Epoch: [8] [ 90/468] eta: 0:00:15 lr: 0.0011045284632676536 img/s: 3554.4243162543116 loss: 0.5306 (0.5527) acc1: 98.4375 (97.7850) acc5: 100.0000 (99.9571) time: 0.0483 data: 0.0032 max mem: 190
Epoch: [8] [100/468] eta: 0:00:15 lr: 0.0011045284632676536 img/s: 3394.6514239465832 loss: 0.5306 (0.5515) acc1: 98.4375 (97.8342) acc5: 100.0000 (99.9536) time: 0.0435 data: 0.0027 max mem: 190
Epoch: [8] [110/468] eta: 0:00:14 lr: 0.0011045284632676536 img/s: 3309.278761280142 loss: 0.5341 (0.5502) acc1: 98.4375 (97.8885) acc5: 100.0000 (99.9578) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [8] [120/468] eta: 0:00:13 lr: 0.0011045284632676536 img/s: 3566.2059730045703 loss: 0.5408 (0.5502) acc1: 98.4375 (97.8758) acc5: 100.0000 (99.9613) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [8] [130/468] eta: 0:00:13 lr: 0.0011045284632676536 img/s: 3405.6121235957194 loss: 0.5434 (0.5499) acc1: 97.6562 (97.8769) acc5: 100.0000 (99.9642) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [8] [140/468] eta: 0:00:13 lr: 0.0011045284632676536 img/s: 3592.7439370415973 loss: 0.5348 (0.5492) acc1: 97.6562 (97.9000) acc5: 100.0000 (99.9612) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [8] [150/468] eta: 0:00:12 lr: 0.0011045284632676536 img/s: 3486.876656989394 loss: 0.5285 (0.5483) acc1: 98.4375 (97.9408) acc5: 100.0000 (99.9534) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [8] [160/468] eta: 0:00:12 lr: 0.0011045284632676536 img/s: 3574.635372763651 loss: 0.5372 (0.5475) acc1: 98.4375 (97.9862) acc5: 100.0000 (99.9515) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [8] [170/468] eta: 0:00:11 lr: 0.0011045284632676536 img/s: 3459.509572322424 loss: 0.5390 (0.5475) acc1: 98.4375 (97.9989) acc5: 100.0000 (99.9543) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [8] [180/468] eta: 0:00:11 lr: 0.0011045284632676536 img/s: 3559.8686576664986 loss: 0.5453 (0.5478) acc1: 98.4375 (98.0016) acc5: 100.0000 (99.9568) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [8] [190/468] eta: 0:00:10 lr: 0.0011045284632676536 img/s: 3072.609495902202 loss: 0.5619 (0.5487) acc1: 97.6562 (97.9917) acc5: 100.0000 (99.9591) time: 0.0396 data: 0.0003 max mem: 190
Epoch: [8] [200/468] eta: 0:00:10 lr: 0.0011045284632676536 img/s: 3079.9600254717143 loss: 0.5619 (0.5487) acc1: 97.6562 (97.9711) acc5: 100.0000 (99.9611) time: 0.0421 data: 0.0004 max mem: 190
Epoch: [8] [210/468] eta: 0:00:10 lr: 0.0011045284632676536 img/s: 3520.095675207847 loss: 0.5388 (0.5484) acc1: 97.6562 (98.0006) acc5: 100.0000 (99.9630) time: 0.0402 data: 0.0004 max mem: 190
Epoch: [8] [220/468] eta: 0:00:09 lr: 0.0011045284632676536 img/s: 3515.3705908159322 loss: 0.5334 (0.5481) acc1: 98.4375 (98.0098) acc5: 100.0000 (99.9646) time: 0.0375 data: 0.0002 max mem: 190
Epoch: [8] [230/468] eta: 0:00:09 lr: 0.0011045284632676536 img/s: 3547.4488700938286 loss: 0.5408 (0.5483) acc1: 98.4375 (97.9978) acc5: 100.0000 (99.9628) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [8] [240/468] eta: 0:00:08 lr: 0.0011045284632676536 img/s: 3518.2271735355216 loss: 0.5406 (0.5481) acc1: 97.6562 (97.9999) acc5: 100.0000 (99.9611) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [8] [250/468] eta: 0:00:08 lr: 0.0011045284632676536 img/s: 3403.798411178809 loss: 0.5354 (0.5477) acc1: 98.4375 (98.0142) acc5: 100.0000 (99.9626) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [8] [260/468] eta: 0:00:08 lr: 0.0011045284632676536 img/s: 3418.7324851309872 loss: 0.5354 (0.5476) acc1: 98.4375 (98.0244) acc5: 100.0000 (99.9581) time: 0.0415 data: 0.0014 max mem: 190
Epoch: [8] [270/468] eta: 0:00:07 lr: 0.0011045284632676536 img/s: 3439.9366438136735 loss: 0.5399 (0.5473) acc1: 98.4375 (98.0339) acc5: 100.0000 (99.9596) time: 0.0417 data: 0.0014 max mem: 190
Epoch: [8] [280/468] eta: 0:00:07 lr: 0.0011045284632676536 img/s: 3443.4005631345685 loss: 0.5378 (0.5470) acc1: 98.4375 (98.0344) acc5: 100.0000 (99.9611) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [8] [290/468] eta: 0:00:06 lr: 0.0011045284632676536 img/s: 3593.2248547640083 loss: 0.5373 (0.5468) acc1: 97.6562 (98.0402) acc5: 100.0000 (99.9597) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [8] [300/468] eta: 0:00:06 lr: 0.0011045284632676536 img/s: 3540.967780657842 loss: 0.5535 (0.5469) acc1: 97.6562 (98.0456) acc5: 100.0000 (99.9611) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [8] [310/468] eta: 0:00:06 lr: 0.0011045284632676536 img/s: 3562.325238208987 loss: 0.5366 (0.5469) acc1: 98.4375 (98.0481) acc5: 100.0000 (99.9623) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [8] [320/468] eta: 0:00:05 lr: 0.0011045284632676536 img/s: 3435.7319612699266 loss: 0.5502 (0.5472) acc1: 96.8750 (98.0262) acc5: 100.0000 (99.9635) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [8] [330/468] eta: 0:00:05 lr: 0.0011045284632676536 img/s: 3478.8105179943755 loss: 0.5403 (0.5468) acc1: 97.6562 (98.0433) acc5: 100.0000 (99.9646) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [8] [340/468] eta: 0:00:04 lr: 0.0011045284632676536 img/s: 3482.1725160043325 loss: 0.5320 (0.5465) acc1: 98.4375 (98.0595) acc5: 100.0000 (99.9656) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [8] [350/468] eta: 0:00:04 lr: 0.0011045284632676536 img/s: 3570.237620865309 loss: 0.5320 (0.5463) acc1: 98.4375 (98.0747) acc5: 100.0000 (99.9666) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [8] [360/468] eta: 0:00:04 lr: 0.0011045284632676536 img/s: 3473.813391308849 loss: 0.5514 (0.5467) acc1: 98.4375 (98.0674) acc5: 100.0000 (99.9675) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [8] [370/468] eta: 0:00:03 lr: 0.0011045284632676536 img/s: 3036.1365176132604 loss: 0.5527 (0.5467) acc1: 97.6562 (98.0711) acc5: 100.0000 (99.9684) time: 0.0452 data: 0.0016 max mem: 190
Epoch: [8] [380/468] eta: 0:00:03 lr: 0.0011045284632676536 img/s: 3556.378590355061 loss: 0.5467 (0.5466) acc1: 97.6562 (98.0746) acc5: 100.0000 (99.9692) time: 0.0491 data: 0.0032 max mem: 190
Epoch: [8] [390/468] eta: 0:00:03 lr: 0.0011045284632676536 img/s: 3614.294450690382 loss: 0.5329 (0.5464) acc1: 98.4375 (98.0818) acc5: 100.0000 (99.9700) time: 0.0408 data: 0.0019 max mem: 190
Epoch: [8] [400/468] eta: 0:00:02 lr: 0.0011045284632676536 img/s: 3452.4350471045946 loss: 0.5323 (0.5460) acc1: 99.2188 (98.1024) acc5: 100.0000 (99.9708) time: 0.0368 data: 0.0004 max mem: 190
Epoch: [8] [410/468] eta: 0:00:02 lr: 0.0011045284632676536 img/s: 3487.9640334977034 loss: 0.5392 (0.5461) acc1: 98.4375 (98.1049) acc5: 100.0000 (99.9696) time: 0.0366 data: 0.0004 max mem: 190
Epoch: [8] [420/468] eta: 0:00:01 lr: 0.0011045284632676536 img/s: 3611.0611942908645 loss: 0.5458 (0.5460) acc1: 98.4375 (98.1072) acc5: 100.0000 (99.9703) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [8] [430/468] eta: 0:00:01 lr: 0.0011045284632676536 img/s: 3563.412884469873 loss: 0.5407 (0.5459) acc1: 98.4375 (98.1130) acc5: 100.0000 (99.9710) time: 0.0362 data: 0.0002 max mem: 190
Epoch: [8] [440/468] eta: 0:00:01 lr: 0.0011045284632676536 img/s: 3511.278111694648 loss: 0.5311 (0.5457) acc1: 98.4375 (98.1257) acc5: 100.0000 (99.9699) time: 0.0363 data: 0.0003 max mem: 190
Epoch: [8] [450/468] eta: 0:00:00 lr: 0.0011045284632676536 img/s: 3517.4895465475106 loss: 0.5311 (0.5458) acc1: 99.2188 (98.1257) acc5: 100.0000 (99.9688) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [8] [460/468] eta: 0:00:00 lr: 0.0011045284632676536 img/s: 3581.455421172358 loss: 0.5253 (0.5455) acc1: 99.2188 (98.1375) acc5: 100.0000 (99.9695) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [8] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5504 (0.5504) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1249 data: 0.1060 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.000 Acc@5 99.970
Epoch: [9] [ 0/468] eta: 0:01:14 lr: 0.0008954715367323467 img/s: 2598.3995043946256 loss: 0.5086 (0.5086) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.1588 data: 0.1095 max mem: 190
Epoch: [9] [ 10/468] eta: 0:00:23 lr: 0.0008954715367323467 img/s: 2439.9008898462994 loss: 0.5330 (0.5354) acc1: 98.4375 (98.5085) acc5: 100.0000 (99.9290) time: 0.0515 data: 0.0102 max mem: 190
Epoch: [9] [ 20/468] eta: 0:00:20 lr: 0.0008954715367323467 img/s: 2203.5869723151436 loss: 0.5330 (0.5314) acc1: 98.4375 (98.7351) acc5: 100.0000 (99.9628) time: 0.0408 data: 0.0003 max mem: 190
Epoch: [9] [ 30/468] eta: 0:00:19 lr: 0.0008954715367323467 img/s: 3566.5376469806683 loss: 0.5352 (0.5309) acc1: 98.4375 (98.6895) acc5: 100.0000 (99.9748) time: 0.0395 data: 0.0009 max mem: 190
Epoch: [9] [ 40/468] eta: 0:00:18 lr: 0.0008954715367323467 img/s: 3491.2756429848805 loss: 0.5292 (0.5306) acc1: 98.4375 (98.7424) acc5: 100.0000 (99.9809) time: 0.0376 data: 0.0009 max mem: 190
Epoch: [9] [ 50/468] eta: 0:00:17 lr: 0.0008954715367323467 img/s: 3387.283666464769 loss: 0.5334 (0.5317) acc1: 98.4375 (98.6520) acc5: 100.0000 (99.9847) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [9] [ 60/468] eta: 0:00:16 lr: 0.0008954715367323467 img/s: 3530.6517953439434 loss: 0.5342 (0.5321) acc1: 98.4375 (98.6680) acc5: 100.0000 (99.9872) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [9] [ 70/468] eta: 0:00:15 lr: 0.0008954715367323467 img/s: 3613.8078768990517 loss: 0.5293 (0.5325) acc1: 98.4375 (98.6356) acc5: 100.0000 (99.9890) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [9] [ 80/468] eta: 0:00:15 lr: 0.0008954715367323467 img/s: 3584.276876856828 loss: 0.5376 (0.5332) acc1: 98.4375 (98.5822) acc5: 100.0000 (99.9904) time: 0.0363 data: 0.0002 max mem: 190
Epoch: [9] [ 90/468] eta: 0:00:14 lr: 0.0008954715367323467 img/s: 3455.723990550796 loss: 0.5376 (0.5334) acc1: 98.4375 (98.5834) acc5: 100.0000 (99.9914) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [9] [100/468] eta: 0:00:14 lr: 0.0008954715367323467 img/s: 3509.9925599032395 loss: 0.5252 (0.5334) acc1: 99.2188 (98.5767) acc5: 100.0000 (99.9845) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [9] [110/468] eta: 0:00:14 lr: 0.0008954715367323467 img/s: 2140.45439576431 loss: 0.5335 (0.5337) acc1: 98.4375 (98.5642) acc5: 100.0000 (99.9859) time: 0.0437 data: 0.0011 max mem: 190
Epoch: [9] [120/468] eta: 0:00:14 lr: 0.0008954715367323467 img/s: 2856.7904303775913 loss: 0.5341 (0.5336) acc1: 98.4375 (98.5731) acc5: 100.0000 (99.9871) time: 0.0544 data: 0.0045 max mem: 190
Epoch: [9] [130/468] eta: 0:00:14 lr: 0.0008954715367323467 img/s: 1459.1936682403655 loss: 0.5305 (0.5335) acc1: 99.2188 (98.5866) acc5: 100.0000 (99.9881) time: 0.0578 data: 0.0061 max mem: 190
Epoch: [9] [140/468] eta: 0:00:14 lr: 0.0008954715367323467 img/s: 2368.293897860965 loss: 0.5229 (0.5326) acc1: 99.2188 (98.6370) acc5: 100.0000 (99.9889) time: 0.0550 data: 0.0053 max mem: 190
Epoch: [9] [150/468] eta: 0:00:13 lr: 0.0008954715367323467 img/s: 3473.1841424283202 loss: 0.5229 (0.5323) acc1: 99.2188 (98.6445) acc5: 100.0000 (99.9897) time: 0.0454 data: 0.0028 max mem: 190
Epoch: [9] [160/468] eta: 0:00:13 lr: 0.0008954715367323467 img/s: 3528.6097221126797 loss: 0.5357 (0.5330) acc1: 98.4375 (98.6122) acc5: 100.0000 (99.9903) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [9] [170/468] eta: 0:00:12 lr: 0.0008954715367323467 img/s: 3583.1286298745936 loss: 0.5247 (0.5322) acc1: 98.4375 (98.6431) acc5: 100.0000 (99.9909) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [9] [180/468] eta: 0:00:12 lr: 0.0008954715367323467 img/s: 3462.253727493164 loss: 0.5234 (0.5326) acc1: 99.2188 (98.6360) acc5: 100.0000 (99.9871) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [9] [190/468] eta: 0:00:11 lr: 0.0008954715367323467 img/s: 3527.1955797620376 loss: 0.5289 (0.5326) acc1: 98.4375 (98.6420) acc5: 100.0000 (99.9877) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [9] [200/468] eta: 0:00:11 lr: 0.0008954715367323467 img/s: 3514.059039914124 loss: 0.5289 (0.5330) acc1: 98.4375 (98.6435) acc5: 100.0000 (99.9883) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [9] [210/468] eta: 0:00:10 lr: 0.0008954715367323467 img/s: 3603.982868574037 loss: 0.5418 (0.5334) acc1: 98.4375 (98.6226) acc5: 100.0000 (99.9852) time: 0.0376 data: 0.0002 max mem: 190
Epoch: [9] [220/468] eta: 0:00:10 lr: 0.0008954715367323467 img/s: 3533.626306505542 loss: 0.5405 (0.5335) acc1: 98.4375 (98.6178) acc5: 100.0000 (99.9859) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [9] [230/468] eta: 0:00:09 lr: 0.0008954715367323467 img/s: 3551.649645080411 loss: 0.5366 (0.5336) acc1: 98.4375 (98.6134) acc5: 100.0000 (99.9831) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [9] [240/468] eta: 0:00:09 lr: 0.0008954715367323467 img/s: 3531.1626830134574 loss: 0.5348 (0.5340) acc1: 98.4375 (98.5996) acc5: 100.0000 (99.9838) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [9] [250/468] eta: 0:00:08 lr: 0.0008954715367323467 img/s: 3386.557194221914 loss: 0.5304 (0.5338) acc1: 98.4375 (98.6118) acc5: 100.0000 (99.9844) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [9] [260/468] eta: 0:00:08 lr: 0.0008954715367323467 img/s: 3556.143021792409 loss: 0.5290 (0.5342) acc1: 98.4375 (98.5961) acc5: 100.0000 (99.9790) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [9] [270/468] eta: 0:00:07 lr: 0.0008954715367323467 img/s: 3552.77780204218 loss: 0.5240 (0.5341) acc1: 98.4375 (98.6076) acc5: 100.0000 (99.9798) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [9] [280/468] eta: 0:00:07 lr: 0.0008954715367323467 img/s: 3469.7273444063853 loss: 0.5242 (0.5341) acc1: 98.4375 (98.6043) acc5: 100.0000 (99.9778) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [9] [290/468] eta: 0:00:07 lr: 0.0008954715367323467 img/s: 3458.595820341691 loss: 0.5285 (0.5343) acc1: 98.4375 (98.6013) acc5: 100.0000 (99.9758) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [9] [300/468] eta: 0:00:06 lr: 0.0008954715367323467 img/s: 3508.5245100281663 loss: 0.5416 (0.5351) acc1: 98.4375 (98.5751) acc5: 100.0000 (99.9740) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [9] [310/468] eta: 0:00:06 lr: 0.0008954715367323467 img/s: 3411.8886325649974 loss: 0.5515 (0.5356) acc1: 97.6562 (98.5480) acc5: 100.0000 (99.9749) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [9] [320/468] eta: 0:00:05 lr: 0.0008954715367323467 img/s: 3536.535943665312 loss: 0.5444 (0.5360) acc1: 97.6562 (98.5349) acc5: 100.0000 (99.9757) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [9] [330/468] eta: 0:00:05 lr: 0.0008954715367323467 img/s: 3589.9091407556 loss: 0.5402 (0.5362) acc1: 98.4375 (98.5272) acc5: 100.0000 (99.9740) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [9] [340/468] eta: 0:00:05 lr: 0.0008954715367323467 img/s: 3572.2093272384905 loss: 0.5437 (0.5365) acc1: 98.4375 (98.5085) acc5: 100.0000 (99.9725) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [9] [350/468] eta: 0:00:04 lr: 0.0008954715367323467 img/s: 3516.4758143221134 loss: 0.5372 (0.5365) acc1: 98.4375 (98.4976) acc5: 100.0000 (99.9733) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [9] [360/468] eta: 0:00:04 lr: 0.0008954715367323467 img/s: 3577.255392160129 loss: 0.5323 (0.5364) acc1: 98.4375 (98.4981) acc5: 100.0000 (99.9740) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [9] [370/468] eta: 0:00:03 lr: 0.0008954715367323467 img/s: 3451.6137892016304 loss: 0.5394 (0.5369) acc1: 97.6562 (98.4712) acc5: 100.0000 (99.9747) time: 0.0375 data: 0.0002 max mem: 190
Epoch: [9] [380/468] eta: 0:00:03 lr: 0.0008954715367323467 img/s: 3403.3884345720335 loss: 0.5448 (0.5369) acc1: 98.4375 (98.4724) acc5: 100.0000 (99.9754) time: 0.0378 data: 0.0002 max mem: 190
Epoch: [9] [390/468] eta: 0:00:03 lr: 0.0008954715367323467 img/s: 3540.5707954680347 loss: 0.5286 (0.5366) acc1: 98.4375 (98.4895) acc5: 100.0000 (99.9760) time: 0.0375 data: 0.0002 max mem: 190
Epoch: [9] [400/468] eta: 0:00:02 lr: 0.0008954715367323467 img/s: 3487.0804884385557 loss: 0.5253 (0.5365) acc1: 99.2188 (98.4901) acc5: 100.0000 (99.9747) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [9] [410/468] eta: 0:00:02 lr: 0.0008954715367323467 img/s: 3550.0291741056667 loss: 0.5383 (0.5368) acc1: 98.4375 (98.4793) acc5: 100.0000 (99.9734) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [9] [420/468] eta: 0:00:01 lr: 0.0008954715367323467 img/s: 1396.214262494181 loss: 0.5412 (0.5369) acc1: 98.4375 (98.4765) acc5: 100.0000 (99.9740) time: 0.0427 data: 0.0003 max mem: 190
Epoch: [9] [430/468] eta: 0:00:01 lr: 0.0008954715367323467 img/s: 3218.3997170484313 loss: 0.5426 (0.5372) acc1: 97.6562 (98.4665) acc5: 100.0000 (99.9746) time: 0.0484 data: 0.0027 max mem: 190
Epoch: [9] [440/468] eta: 0:00:01 lr: 0.0008954715367323467 img/s: 3407.51427755387 loss: 0.5414 (0.5373) acc1: 97.6562 (98.4588) acc5: 100.0000 (99.9752) time: 0.0437 data: 0.0030 max mem: 190
Epoch: [9] [450/468] eta: 0:00:00 lr: 0.0008954715367323467 img/s: 3551.7906255168537 loss: 0.5339 (0.5370) acc1: 98.4375 (98.4618) acc5: 100.0000 (99.9757) time: 0.0384 data: 0.0006 max mem: 190
Epoch: [9] [460/468] eta: 0:00:00 lr: 0.0008954715367323467 img/s: 3539.6604009942444 loss: 0.5245 (0.5369) acc1: 98.4375 (98.4697) acc5: 100.0000 (99.9746) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [9] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:11 loss: 0.5483 (0.5483) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1414 data: 0.1171 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.280 Acc@5 99.960
Epoch: [10] [ 0/468] eta: 0:01:13 lr: 0.0006909830056250527 img/s: 2420.8236928016163 loss: 0.5375 (0.5375) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1576 data: 0.1046 max mem: 190
Epoch: [10] [ 10/468] eta: 0:00:24 lr: 0.0006909830056250527 img/s: 3436.0398087643284 loss: 0.5144 (0.5181) acc1: 99.2188 (99.2188) acc5: 100.0000 (100.0000) time: 0.0530 data: 0.0104 max mem: 190
Epoch: [10] [ 20/468] eta: 0:00:21 lr: 0.0006909830056250527 img/s: 3524.069946962138 loss: 0.5215 (0.5234) acc1: 99.2188 (99.1071) acc5: 100.0000 (100.0000) time: 0.0421 data: 0.0007 max mem: 190
Epoch: [10] [ 30/468] eta: 0:00:19 lr: 0.0006909830056250527 img/s: 3538.447269731422 loss: 0.5242 (0.5217) acc1: 99.2188 (99.1431) acc5: 100.0000 (100.0000) time: 0.0394 data: 0.0003 max mem: 190
Epoch: [10] [ 40/468] eta: 0:00:18 lr: 0.0006909830056250527 img/s: 3512.8633906955442 loss: 0.5161 (0.5239) acc1: 99.2188 (99.0091) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [10] [ 50/468] eta: 0:00:17 lr: 0.0006909830056250527 img/s: 3348.3280029936386 loss: 0.5179 (0.5234) acc1: 99.2188 (99.0656) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [10] [ 60/468] eta: 0:00:16 lr: 0.0006909830056250527 img/s: 3320.2280314415234 loss: 0.5163 (0.5226) acc1: 99.2188 (99.0907) acc5: 100.0000 (100.0000) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [10] [ 70/468] eta: 0:00:16 lr: 0.0006909830056250527 img/s: 3438.5706453513694 loss: 0.5123 (0.5220) acc1: 99.2188 (99.1087) acc5: 100.0000 (100.0000) time: 0.0385 data: 0.0003 max mem: 190
Epoch: [10] [ 80/468] eta: 0:00:15 lr: 0.0006909830056250527 img/s: 3534.6034103627626 loss: 0.5209 (0.5231) acc1: 99.2188 (99.0451) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [10] [ 90/468] eta: 0:00:15 lr: 0.0006909830056250527 img/s: 3577.0170498837356 loss: 0.5209 (0.5231) acc1: 99.2188 (99.0556) acc5: 100.0000 (100.0000) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [10] [100/468] eta: 0:00:14 lr: 0.0006909830056250527 img/s: 3524.069946962138 loss: 0.5268 (0.5248) acc1: 98.4375 (99.0022) acc5: 100.0000 (100.0000) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [10] [110/468] eta: 0:00:14 lr: 0.0006909830056250527 img/s: 3550.2639333421507 loss: 0.5263 (0.5245) acc1: 99.2188 (99.0217) acc5: 100.0000 (100.0000) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [10] [120/468] eta: 0:00:13 lr: 0.0006909830056250527 img/s: 3530.9072206985907 loss: 0.5166 (0.5249) acc1: 99.2188 (99.0121) acc5: 100.0000 (100.0000) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [10] [130/468] eta: 0:00:13 lr: 0.0006909830056250527 img/s: 3610.624055093751 loss: 0.5242 (0.5256) acc1: 99.2188 (98.9742) acc5: 100.0000 (100.0000) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [10] [140/468] eta: 0:00:12 lr: 0.0006909830056250527 img/s: 3567.912382370142 loss: 0.5208 (0.5253) acc1: 99.2188 (99.0082) acc5: 100.0000 (99.9945) time: 0.0363 data: 0.0002 max mem: 190
Epoch: [10] [150/468] eta: 0:00:12 lr: 0.0006909830056250527 img/s: 2762.3779244768484 loss: 0.5192 (0.5255) acc1: 99.2188 (99.0014) acc5: 100.0000 (99.9948) time: 0.0406 data: 0.0003 max mem: 190
Epoch: [10] [160/468] eta: 0:00:12 lr: 0.0006909830056250527 img/s: 3191.291160910658 loss: 0.5192 (0.5253) acc1: 99.2188 (99.0149) acc5: 100.0000 (99.9951) time: 0.0430 data: 0.0003 max mem: 190
Epoch: [10] [170/468] eta: 0:00:11 lr: 0.0006909830056250527 img/s: 3518.8267232960393 loss: 0.5229 (0.5257) acc1: 98.4375 (98.9766) acc5: 100.0000 (99.9954) time: 0.0413 data: 0.0003 max mem: 190
Epoch: [10] [180/468] eta: 0:00:11 lr: 0.0006909830056250527 img/s: 3413.4939311669073 loss: 0.5251 (0.5263) acc1: 98.4375 (98.9555) acc5: 100.0000 (99.9957) time: 0.0395 data: 0.0003 max mem: 190
Epoch: [10] [190/468] eta: 0:00:10 lr: 0.0006909830056250527 img/s: 1556.440813486599 loss: 0.5210 (0.5262) acc1: 99.2188 (98.9570) acc5: 100.0000 (99.9959) time: 0.0407 data: 0.0002 max mem: 190
Epoch: [10] [200/468] eta: 0:00:10 lr: 0.0006909830056250527 img/s: 2556.966489493437 loss: 0.5269 (0.5272) acc1: 98.4375 (98.9195) acc5: 100.0000 (99.9961) time: 0.0485 data: 0.0027 max mem: 190
Epoch: [10] [210/468] eta: 0:00:10 lr: 0.0006909830056250527 img/s: 3502.276126608042 loss: 0.5357 (0.5273) acc1: 98.4375 (98.9114) acc5: 100.0000 (99.9963) time: 0.0458 data: 0.0027 max mem: 190
Epoch: [10] [220/468] eta: 0:00:09 lr: 0.0006909830056250527 img/s: 3562.79804629433 loss: 0.5338 (0.5279) acc1: 98.4375 (98.8758) acc5: 100.0000 (99.9965) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [10] [230/468] eta: 0:00:09 lr: 0.0006909830056250527 img/s: 3597.125038525963 loss: 0.5187 (0.5277) acc1: 99.2188 (98.8907) acc5: 100.0000 (99.9966) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [10] [240/468] eta: 0:00:09 lr: 0.0006909830056250527 img/s: 3483.7994354498555 loss: 0.5226 (0.5282) acc1: 99.2188 (98.8654) acc5: 100.0000 (99.9968) time: 0.0364 data: 0.0002 max mem: 190
Epoch: [10] [250/468] eta: 0:00:08 lr: 0.0006909830056250527 img/s: 3499.057647311856 loss: 0.5308 (0.5282) acc1: 98.4375 (98.8546) acc5: 100.0000 (99.9969) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [10] [260/468] eta: 0:00:08 lr: 0.0006909830056250527 img/s: 3570.4988062222756 loss: 0.5202 (0.5281) acc1: 99.2188 (98.8685) acc5: 100.0000 (99.9940) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [10] [270/468] eta: 0:00:07 lr: 0.0006909830056250527 img/s: 3405.525712509594 loss: 0.5211 (0.5282) acc1: 99.2188 (98.8555) acc5: 100.0000 (99.9942) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [10] [280/468] eta: 0:00:07 lr: 0.0006909830056250527 img/s: 3477.5711518904527 loss: 0.5305 (0.5286) acc1: 98.4375 (98.8295) acc5: 100.0000 (99.9944) time: 0.0385 data: 0.0003 max mem: 190
Epoch: [10] [290/468] eta: 0:00:06 lr: 0.0006909830056250527 img/s: 3542.112530349414 loss: 0.5300 (0.5288) acc1: 98.4375 (98.8187) acc5: 100.0000 (99.9946) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [10] [300/468] eta: 0:00:06 lr: 0.0006909830056250527 img/s: 3544.1234734159834 loss: 0.5232 (0.5288) acc1: 99.2188 (98.8268) acc5: 100.0000 (99.9948) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [10] [310/468] eta: 0:00:06 lr: 0.0006909830056250527 img/s: 3441.08315707162 loss: 0.5262 (0.5292) acc1: 99.2188 (98.8118) acc5: 100.0000 (99.9950) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [10] [320/468] eta: 0:00:05 lr: 0.0006909830056250527 img/s: 3481.9466751412247 loss: 0.5417 (0.5296) acc1: 98.4375 (98.7977) acc5: 100.0000 (99.9951) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [10] [330/468] eta: 0:00:05 lr: 0.0006909830056250527 img/s: 3582.1245170975812 loss: 0.5399 (0.5302) acc1: 97.6562 (98.7632) acc5: 100.0000 (99.9953) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [10] [340/468] eta: 0:00:04 lr: 0.0006909830056250527 img/s: 3493.865795484866 loss: 0.5399 (0.5307) acc1: 97.6562 (98.7422) acc5: 100.0000 (99.9954) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [10] [350/468] eta: 0:00:04 lr: 0.0006909830056250527 img/s: 3524.671489909269 loss: 0.5373 (0.5309) acc1: 98.4375 (98.7313) acc5: 100.0000 (99.9955) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [10] [360/468] eta: 0:00:04 lr: 0.0006909830056250527 img/s: 3499.719120753044 loss: 0.5248 (0.5307) acc1: 98.4375 (98.7383) acc5: 100.0000 (99.9957) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [10] [370/468] eta: 0:00:03 lr: 0.0006909830056250527 img/s: 3413.9063461783035 loss: 0.5168 (0.5305) acc1: 99.2188 (98.7449) acc5: 100.0000 (99.9958) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [10] [380/468] eta: 0:00:03 lr: 0.0006909830056250527 img/s: 3589.9571508813224 loss: 0.5213 (0.5303) acc1: 99.2188 (98.7533) acc5: 100.0000 (99.9959) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [10] [390/468] eta: 0:00:03 lr: 0.0006909830056250527 img/s: 3470.4450736273257 loss: 0.5237 (0.5301) acc1: 99.2188 (98.7612) acc5: 100.0000 (99.9960) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [10] [400/468] eta: 0:00:02 lr: 0.0006909830056250527 img/s: 3534.463791014905 loss: 0.5256 (0.5300) acc1: 99.2188 (98.7629) acc5: 100.0000 (99.9961) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [10] [410/468] eta: 0:00:02 lr: 0.0006909830056250527 img/s: 2385.032927587739 loss: 0.5269 (0.5300) acc1: 98.4375 (98.7625) acc5: 100.0000 (99.9962) time: 0.0418 data: 0.0006 max mem: 190
Epoch: [10] [420/468] eta: 0:00:01 lr: 0.0006909830056250527 img/s: 1669.8991660938293 loss: 0.5245 (0.5301) acc1: 98.4375 (98.7622) acc5: 100.0000 (99.9963) time: 0.0561 data: 0.0036 max mem: 190
Epoch: [10] [430/468] eta: 0:00:01 lr: 0.0006909830056250527 img/s: 3159.8113780561016 loss: 0.5219 (0.5300) acc1: 98.4375 (98.7583) acc5: 100.0000 (99.9964) time: 0.0600 data: 0.0044 max mem: 190
Epoch: [10] [440/468] eta: 0:00:01 lr: 0.0006909830056250527 img/s: 3148.636799230539 loss: 0.5261 (0.5301) acc1: 98.4375 (98.7564) acc5: 100.0000 (99.9929) time: 0.0488 data: 0.0015 max mem: 190
Epoch: [10] [450/468] eta: 0:00:00 lr: 0.0006909830056250527 img/s: 3290.860071104573 loss: 0.5279 (0.5301) acc1: 98.4375 (98.7562) acc5: 100.0000 (99.9931) time: 0.0413 data: 0.0004 max mem: 190
Epoch: [10] [460/468] eta: 0:00:00 lr: 0.0006909830056250527 img/s: 1988.366599260757 loss: 0.5279 (0.5301) acc1: 99.2188 (98.7578) acc5: 100.0000 (99.9932) time: 0.0415 data: 0.0003 max mem: 190
Epoch: [10] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:10 loss: 0.5544 (0.5544) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1328 data: 0.1155 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.270 Acc@5 99.970
Epoch: [11] [ 0/468] eta: 0:01:20 lr: 0.0005000000000000002 img/s: 2074.494628974173 loss: 0.5164 (0.5164) acc1: 99.2188 (99.2188) acc5: 100.0000 (100.0000) time: 0.1717 data: 0.1100 max mem: 190
Epoch: [11] [ 10/468] eta: 0:00:25 lr: 0.0005000000000000002 img/s: 3200.5181197651195 loss: 0.5164 (0.5185) acc1: 99.2188 (99.1477) acc5: 100.0000 (100.0000) time: 0.0554 data: 0.0103 max mem: 190
Epoch: [11] [ 20/468] eta: 0:00:21 lr: 0.0005000000000000002 img/s: 3149.2278254544603 loss: 0.5217 (0.5232) acc1: 99.2188 (98.9583) acc5: 100.0000 (100.0000) time: 0.0423 data: 0.0004 max mem: 190
Epoch: [11] [ 30/468] eta: 0:00:20 lr: 0.0005000000000000002 img/s: 3227.5514728868584 loss: 0.5288 (0.5235) acc1: 99.2188 (98.9919) acc5: 100.0000 (100.0000) time: 0.0406 data: 0.0004 max mem: 190
Epoch: [11] [ 40/468] eta: 0:00:19 lr: 0.0005000000000000002 img/s: 3470.4226400946354 loss: 0.5152 (0.5219) acc1: 99.2188 (99.0663) acc5: 100.0000 (99.9809) time: 0.0402 data: 0.0003 max mem: 190
Epoch: [11] [ 50/468] eta: 0:00:18 lr: 0.0005000000000000002 img/s: 3297.773387879458 loss: 0.5106 (0.5212) acc1: 99.2188 (99.0656) acc5: 100.0000 (99.9847) time: 0.0390 data: 0.0003 max mem: 190
Epoch: [11] [ 60/468] eta: 0:00:17 lr: 0.0005000000000000002 img/s: 3524.5095158378467 loss: 0.5191 (0.5218) acc1: 99.2188 (99.0523) acc5: 100.0000 (99.9872) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [11] [ 70/468] eta: 0:00:16 lr: 0.0005000000000000002 img/s: 3443.8423277504444 loss: 0.5181 (0.5218) acc1: 99.2188 (99.0427) acc5: 100.0000 (99.9890) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [11] [ 80/468] eta: 0:00:15 lr: 0.0005000000000000002 img/s: 3546.980126849894 loss: 0.5163 (0.5221) acc1: 99.2188 (99.0355) acc5: 100.0000 (99.9904) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [11] [ 90/468] eta: 0:00:15 lr: 0.0005000000000000002 img/s: 3493.1610753975483 loss: 0.5179 (0.5222) acc1: 99.2188 (99.0299) acc5: 100.0000 (99.9914) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [11] [100/468] eta: 0:00:14 lr: 0.0005000000000000002 img/s: 3373.745770806626 loss: 0.5201 (0.5224) acc1: 99.2188 (99.0176) acc5: 100.0000 (99.9923) time: 0.0377 data: 0.0004 max mem: 190
Epoch: [11] [110/468] eta: 0:00:14 lr: 0.0005000000000000002 img/s: 3415.90471342767 loss: 0.5171 (0.5224) acc1: 99.2188 (99.0146) acc5: 100.0000 (99.9930) time: 0.0377 data: 0.0004 max mem: 190
Epoch: [11] [120/468] eta: 0:00:13 lr: 0.0005000000000000002 img/s: 3553.130498087334 loss: 0.5166 (0.5222) acc1: 99.2188 (99.0121) acc5: 100.0000 (99.9935) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [11] [130/468] eta: 0:00:13 lr: 0.0005000000000000002 img/s: 3494.0249651815116 loss: 0.5192 (0.5222) acc1: 99.2188 (99.0160) acc5: 100.0000 (99.9940) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [11] [140/468] eta: 0:00:12 lr: 0.0005000000000000002 img/s: 3481.5402354009275 loss: 0.5231 (0.5224) acc1: 99.2188 (99.0193) acc5: 100.0000 (99.9945) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [11] [150/468] eta: 0:00:12 lr: 0.0005000000000000002 img/s: 3100.5042389521586 loss: 0.5230 (0.5225) acc1: 99.2188 (99.0273) acc5: 100.0000 (99.9948) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [11] [160/468] eta: 0:00:12 lr: 0.0005000000000000002 img/s: 3358.0876940590715 loss: 0.5219 (0.5226) acc1: 99.2188 (99.0392) acc5: 100.0000 (99.9951) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [11] [170/468] eta: 0:00:11 lr: 0.0005000000000000002 img/s: 3427.9004456703574 loss: 0.5155 (0.5224) acc1: 99.2188 (99.0451) acc5: 100.0000 (99.9954) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [11] [180/468] eta: 0:00:11 lr: 0.0005000000000000002 img/s: 3247.5632096301 loss: 0.5159 (0.5225) acc1: 99.2188 (99.0331) acc5: 100.0000 (99.9957) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [11] [190/468] eta: 0:00:10 lr: 0.0005000000000000002 img/s: 3533.370487617067 loss: 0.5161 (0.5224) acc1: 99.2188 (99.0388) acc5: 100.0000 (99.9959) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [11] [200/468] eta: 0:00:10 lr: 0.0005000000000000002 img/s: 3471.7018145135216 loss: 0.5194 (0.5228) acc1: 98.4375 (99.0244) acc5: 100.0000 (99.9961) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [11] [210/468] eta: 0:00:10 lr: 0.0005000000000000002 img/s: 3124.103347143988 loss: 0.5306 (0.5233) acc1: 98.4375 (99.0077) acc5: 100.0000 (99.9963) time: 0.0385 data: 0.0003 max mem: 190
Epoch: [11] [220/468] eta: 0:00:09 lr: 0.0005000000000000002 img/s: 3118.1337344709223 loss: 0.5267 (0.5233) acc1: 99.2188 (99.0243) acc5: 100.0000 (99.9965) time: 0.0409 data: 0.0003 max mem: 190
Epoch: [11] [230/468] eta: 0:00:09 lr: 0.0005000000000000002 img/s: 2732.584679594849 loss: 0.5109 (0.5232) acc1: 100.0000 (99.0294) acc5: 100.0000 (99.9966) time: 0.0512 data: 0.0011 max mem: 190
Epoch: [11] [240/468] eta: 0:00:09 lr: 0.0005000000000000002 img/s: 3494.184149381374 loss: 0.5244 (0.5235) acc1: 98.4375 (99.0113) acc5: 100.0000 (99.9935) time: 0.0530 data: 0.0019 max mem: 190
Epoch: [11] [250/468] eta: 0:00:08 lr: 0.0005000000000000002 img/s: 3441.568450473089 loss: 0.5169 (0.5231) acc1: 99.2188 (99.0382) acc5: 100.0000 (99.9938) time: 0.0464 data: 0.0010 max mem: 190
Epoch: [11] [260/468] eta: 0:00:08 lr: 0.0005000000000000002 img/s: 3567.485626951957 loss: 0.5087 (0.5230) acc1: 99.2188 (99.0451) acc5: 100.0000 (99.9940) time: 0.0440 data: 0.0006 max mem: 190
Epoch: [11] [270/468] eta: 0:00:08 lr: 0.0005000000000000002 img/s: 3464.443245615167 loss: 0.5222 (0.5231) acc1: 99.2188 (99.0515) acc5: 100.0000 (99.9942) time: 0.0395 data: 0.0006 max mem: 190
Epoch: [11] [280/468] eta: 0:00:07 lr: 0.0005000000000000002 img/s: 3437.4478144228246 loss: 0.5222 (0.5231) acc1: 99.2188 (99.0547) acc5: 100.0000 (99.9944) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [11] [290/468] eta: 0:00:07 lr: 0.0005000000000000002 img/s: 3604.3216069606315 loss: 0.5156 (0.5230) acc1: 99.2188 (99.0630) acc5: 100.0000 (99.9946) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [11] [300/468] eta: 0:00:06 lr: 0.0005000000000000002 img/s: 3499.4225672513476 loss: 0.5130 (0.5227) acc1: 99.2188 (99.0734) acc5: 100.0000 (99.9922) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [11] [310/468] eta: 0:00:06 lr: 0.0005000000000000002 img/s: 3521.6196261069204 loss: 0.5125 (0.5226) acc1: 99.2188 (99.0756) acc5: 100.0000 (99.9925) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [11] [320/468] eta: 0:00:05 lr: 0.0005000000000000002 img/s: 3499.6506808685394 loss: 0.5142 (0.5223) acc1: 99.2188 (99.0873) acc5: 100.0000 (99.9927) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [11] [330/468] eta: 0:00:05 lr: 0.0005000000000000002 img/s: 3439.870522127466 loss: 0.5142 (0.5224) acc1: 99.2188 (99.0889) acc5: 100.0000 (99.9906) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [11] [340/468] eta: 0:00:05 lr: 0.0005000000000000002 img/s: 3406.044244811988 loss: 0.5148 (0.5224) acc1: 99.2188 (99.0927) acc5: 100.0000 (99.9908) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [11] [350/468] eta: 0:00:04 lr: 0.0005000000000000002 img/s: 3536.6524288217547 loss: 0.5167 (0.5225) acc1: 99.2188 (99.0919) acc5: 100.0000 (99.9889) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [11] [360/468] eta: 0:00:04 lr: 0.0005000000000000002 img/s: 3499.1944833699413 loss: 0.5167 (0.5223) acc1: 99.2188 (99.1041) acc5: 100.0000 (99.9870) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [11] [370/468] eta: 0:00:03 lr: 0.0005000000000000002 img/s: 3468.449624322456 loss: 0.5129 (0.5222) acc1: 99.2188 (99.1029) acc5: 100.0000 (99.9874) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [11] [380/468] eta: 0:00:03 lr: 0.0005000000000000002 img/s: 3447.4469402170425 loss: 0.5251 (0.5225) acc1: 99.2188 (99.0957) acc5: 100.0000 (99.9877) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [11] [390/468] eta: 0:00:03 lr: 0.0005000000000000002 img/s: 3585.473750292183 loss: 0.5270 (0.5227) acc1: 98.4375 (99.0869) acc5: 100.0000 (99.9880) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [11] [400/468] eta: 0:00:02 lr: 0.0005000000000000002 img/s: 3455.968689248516 loss: 0.5184 (0.5230) acc1: 99.2188 (99.0804) acc5: 100.0000 (99.9864) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [11] [410/468] eta: 0:00:02 lr: 0.0005000000000000002 img/s: 3584.205089860336 loss: 0.5204 (0.5232) acc1: 98.4375 (99.0705) acc5: 100.0000 (99.9867) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [11] [420/468] eta: 0:00:01 lr: 0.0005000000000000002 img/s: 3539.0304021094266 loss: 0.5257 (0.5233) acc1: 98.4375 (99.0684) acc5: 100.0000 (99.9852) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [11] [430/468] eta: 0:00:01 lr: 0.0005000000000000002 img/s: 3503.738951105542 loss: 0.5203 (0.5233) acc1: 99.2188 (99.0683) acc5: 100.0000 (99.9855) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [11] [440/468] eta: 0:00:01 lr: 0.0005000000000000002 img/s: 3516.6140161266026 loss: 0.5149 (0.5232) acc1: 99.2188 (99.0699) acc5: 100.0000 (99.9858) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [11] [450/468] eta: 0:00:00 lr: 0.0005000000000000002 img/s: 3491.661650125522 loss: 0.5159 (0.5232) acc1: 99.2188 (99.0698) acc5: 100.0000 (99.9861) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [11] [460/468] eta: 0:00:00 lr: 0.0005000000000000002 img/s: 3486.265305592353 loss: 0.5175 (0.5231) acc1: 99.2188 (99.0747) acc5: 100.0000 (99.9864) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [11] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:10 loss: 0.5518 (0.5518) acc1: 97.6562 (97.6562) acc5: 100.0000 (100.0000) time: 0.1369 data: 0.1105 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.430 Acc@5 99.990
Epoch: [12] [ 0/468] eta: 0:02:14 lr: 0.0003308693936411421 img/s: 948.6122759093495 loss: 0.5233 (0.5233) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.2872 data: 0.1522 max mem: 190
Epoch: [12] [ 10/468] eta: 0:00:36 lr: 0.0003308693936411421 img/s: 3477.0756526751425 loss: 0.5233 (0.5273) acc1: 99.2188 (98.9347) acc5: 100.0000 (100.0000) time: 0.0804 data: 0.0196 max mem: 190
Epoch: [12] [ 20/468] eta: 0:00:26 lr: 0.0003308693936411421 img/s: 3368.8977353304763 loss: 0.5224 (0.5260) acc1: 99.2188 (99.0327) acc5: 100.0000 (100.0000) time: 0.0488 data: 0.0033 max mem: 190
Epoch: [12] [ 30/468] eta: 0:00:23 lr: 0.0003308693936411421 img/s: 3471.9487812922375 loss: 0.5112 (0.5222) acc1: 99.2188 (99.2188) acc5: 100.0000 (100.0000) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [12] [ 40/468] eta: 0:00:21 lr: 0.0003308693936411421 img/s: 3426.893938620232 loss: 0.5088 (0.5198) acc1: 99.2188 (99.2950) acc5: 100.0000 (100.0000) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [12] [ 50/468] eta: 0:00:19 lr: 0.0003308693936411421 img/s: 3567.248584717608 loss: 0.5111 (0.5197) acc1: 99.2188 (99.3107) acc5: 100.0000 (99.9694) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [12] [ 60/468] eta: 0:00:18 lr: 0.0003308693936411421 img/s: 3598.4511009082075 loss: 0.5193 (0.5200) acc1: 99.2188 (99.2572) acc5: 100.0000 (99.9744) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [12] [ 70/468] eta: 0:00:17 lr: 0.0003308693936411421 img/s: 3365.2866634907105 loss: 0.5134 (0.5187) acc1: 99.2188 (99.2958) acc5: 100.0000 (99.9780) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [12] [ 80/468] eta: 0:00:16 lr: 0.0003308693936411421 img/s: 3487.42026048264 loss: 0.5116 (0.5181) acc1: 99.2188 (99.3248) acc5: 100.0000 (99.9807) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [12] [ 90/468] eta: 0:00:16 lr: 0.0003308693936411421 img/s: 2787.694391077233 loss: 0.5146 (0.5188) acc1: 99.2188 (99.2788) acc5: 100.0000 (99.9742) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [12] [100/468] eta: 0:00:15 lr: 0.0003308693936411421 img/s: 3409.505166293034 loss: 0.5101 (0.5178) acc1: 99.2188 (99.3038) acc5: 100.0000 (99.9768) time: 0.0386 data: 0.0002 max mem: 190
Epoch: [12] [110/468] eta: 0:00:14 lr: 0.0003308693936411421 img/s: 3557.1090512757655 loss: 0.5063 (0.5172) acc1: 100.0000 (99.3243) acc5: 100.0000 (99.9789) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [12] [120/468] eta: 0:00:14 lr: 0.0003308693936411421 img/s: 3579.1871358284775 loss: 0.5063 (0.5173) acc1: 100.0000 (99.3285) acc5: 100.0000 (99.9806) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [12] [130/468] eta: 0:00:13 lr: 0.0003308693936411421 img/s: 3457.0591318570223 loss: 0.5063 (0.5167) acc1: 99.2188 (99.3380) acc5: 100.0000 (99.9821) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [12] [140/468] eta: 0:00:13 lr: 0.0003308693936411421 img/s: 3483.822042257177 loss: 0.5083 (0.5166) acc1: 99.2188 (99.3351) acc5: 100.0000 (99.9834) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [12] [150/468] eta: 0:00:12 lr: 0.0003308693936411421 img/s: 3503.2359673735727 loss: 0.5090 (0.5161) acc1: 99.2188 (99.3429) acc5: 100.0000 (99.9845) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [12] [160/468] eta: 0:00:12 lr: 0.0003308693936411421 img/s: 3429.1046543563934 loss: 0.5088 (0.5158) acc1: 99.2188 (99.3595) acc5: 100.0000 (99.9854) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [12] [170/468] eta: 0:00:11 lr: 0.0003308693936411421 img/s: 3477.0981723034674 loss: 0.5088 (0.5158) acc1: 99.2188 (99.3604) acc5: 100.0000 (99.9863) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [12] [180/468] eta: 0:00:11 lr: 0.0003308693936411421 img/s: 2182.446439968292 loss: 0.5171 (0.5164) acc1: 99.2188 (99.3482) acc5: 100.0000 (99.9871) time: 0.0454 data: 0.0003 max mem: 190
Epoch: [12] [190/468] eta: 0:00:11 lr: 0.0003308693936411421 img/s: 2737.196131315037 loss: 0.5142 (0.5163) acc1: 99.2188 (99.3496) acc5: 100.0000 (99.9877) time: 0.0563 data: 0.0035 max mem: 190
Epoch: [12] [200/468] eta: 0:00:11 lr: 0.0003308693936411421 img/s: 3381.0972755784514 loss: 0.5142 (0.5168) acc1: 99.2188 (99.3276) acc5: 100.0000 (99.9883) time: 0.0535 data: 0.0055 max mem: 190
Epoch: [12] [210/468] eta: 0:00:10 lr: 0.0003308693936411421 img/s: 3422.4374123466864 loss: 0.5139 (0.5166) acc1: 99.2188 (99.3372) acc5: 100.0000 (99.9889) time: 0.0428 data: 0.0023 max mem: 190
Epoch: [12] [220/468] eta: 0:00:10 lr: 0.0003308693936411421 img/s: 3410.0898904951855 loss: 0.5099 (0.5163) acc1: 100.0000 (99.3460) acc5: 100.0000 (99.9894) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [12] [230/468] eta: 0:00:09 lr: 0.0003308693936411421 img/s: 3545.2936763695916 loss: 0.5056 (0.5161) acc1: 100.0000 (99.3506) acc5: 100.0000 (99.9899) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [12] [240/468] eta: 0:00:09 lr: 0.0003308693936411421 img/s: 3464.9798763408244 loss: 0.5121 (0.5160) acc1: 99.2188 (99.3517) acc5: 100.0000 (99.9903) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [12] [250/468] eta: 0:00:08 lr: 0.0003308693936411421 img/s: 3481.901510483887 loss: 0.5145 (0.5162) acc1: 99.2188 (99.3464) acc5: 100.0000 (99.9907) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [12] [260/468] eta: 0:00:08 lr: 0.0003308693936411421 img/s: 3417.1875067628207 loss: 0.5087 (0.5161) acc1: 99.2188 (99.3534) acc5: 100.0000 (99.9880) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [12] [270/468] eta: 0:00:08 lr: 0.0003308693936411421 img/s: 2321.513592984489 loss: 0.5086 (0.5161) acc1: 99.2188 (99.3571) acc5: 100.0000 (99.9885) time: 0.0396 data: 0.0003 max mem: 190
Epoch: [12] [280/468] eta: 0:00:07 lr: 0.0003308693936411421 img/s: 2544.41190521327 loss: 0.5116 (0.5161) acc1: 99.2188 (99.3550) acc5: 100.0000 (99.9889) time: 0.0479 data: 0.0041 max mem: 190
Epoch: [12] [290/468] eta: 0:00:07 lr: 0.0003308693936411421 img/s: 3506.5995571608655 loss: 0.5128 (0.5162) acc1: 99.2188 (99.3449) acc5: 100.0000 (99.9893) time: 0.0457 data: 0.0041 max mem: 190
Epoch: [12] [300/468] eta: 0:00:06 lr: 0.0003308693936411421 img/s: 3373.660969233863 loss: 0.5145 (0.5163) acc1: 99.2188 (99.3433) acc5: 100.0000 (99.9896) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [12] [310/468] eta: 0:00:06 lr: 0.0003308693936411421 img/s: 3646.9483394583285 loss: 0.5117 (0.5164) acc1: 99.2188 (99.3343) acc5: 100.0000 (99.9900) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [12] [320/468] eta: 0:00:06 lr: 0.0003308693936411421 img/s: 2982.964190266642 loss: 0.5159 (0.5167) acc1: 99.2188 (99.3137) acc5: 100.0000 (99.9903) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [12] [330/468] eta: 0:00:05 lr: 0.0003308693936411421 img/s: 3034.3518772854954 loss: 0.5159 (0.5167) acc1: 99.2188 (99.3202) acc5: 100.0000 (99.9906) time: 0.0444 data: 0.0004 max mem: 190
Epoch: [12] [340/468] eta: 0:00:05 lr: 0.0003308693936411421 img/s: 3081.7810433504774 loss: 0.5096 (0.5168) acc1: 99.2188 (99.3173) acc5: 100.0000 (99.9908) time: 0.0476 data: 0.0014 max mem: 190
Epoch: [12] [350/468] eta: 0:00:04 lr: 0.0003308693936411421 img/s: 3615.1219269125363 loss: 0.5097 (0.5167) acc1: 99.2188 (99.3234) acc5: 100.0000 (99.9911) time: 0.0413 data: 0.0013 max mem: 190
Epoch: [12] [360/468] eta: 0:00:04 lr: 0.0003308693936411421 img/s: 3503.578895161027 loss: 0.5110 (0.5166) acc1: 99.2188 (99.3248) acc5: 100.0000 (99.9913) time: 0.0364 data: 0.0003 max mem: 190
Epoch: [12] [370/468] eta: 0:00:04 lr: 0.0003308693936411421 img/s: 3446.561674263337 loss: 0.5092 (0.5166) acc1: 99.2188 (99.3240) acc5: 100.0000 (99.9916) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [12] [380/468] eta: 0:00:03 lr: 0.0003308693936411421 img/s: 3527.2419271124195 loss: 0.5092 (0.5165) acc1: 99.2188 (99.3233) acc5: 100.0000 (99.9918) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [12] [390/468] eta: 0:00:03 lr: 0.0003308693936411421 img/s: 3609.555938037866 loss: 0.5093 (0.5164) acc1: 99.2188 (99.3286) acc5: 100.0000 (99.9920) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [12] [400/468] eta: 0:00:02 lr: 0.0003308693936411421 img/s: 3477.7288403487632 loss: 0.5089 (0.5163) acc1: 99.2188 (99.3298) acc5: 100.0000 (99.9922) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [12] [410/468] eta: 0:00:02 lr: 0.0003308693936411421 img/s: 3580.810458213833 loss: 0.5094 (0.5162) acc1: 99.2188 (99.3309) acc5: 100.0000 (99.9924) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [12] [420/468] eta: 0:00:01 lr: 0.0003308693936411421 img/s: 3572.375715312342 loss: 0.5097 (0.5163) acc1: 99.2188 (99.3338) acc5: 100.0000 (99.9926) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [12] [430/468] eta: 0:00:01 lr: 0.0003308693936411421 img/s: 3535.953632962748 loss: 0.5195 (0.5164) acc1: 99.2188 (99.3293) acc5: 100.0000 (99.9927) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [12] [440/468] eta: 0:00:01 lr: 0.0003308693936411421 img/s: 3534.1380554275556 loss: 0.5130 (0.5163) acc1: 99.2188 (99.3339) acc5: 100.0000 (99.9929) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [12] [450/468] eta: 0:00:00 lr: 0.0003308693936411421 img/s: 3457.749359164273 loss: 0.5076 (0.5163) acc1: 99.2188 (99.3383) acc5: 100.0000 (99.9931) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [12] [460/468] eta: 0:00:00 lr: 0.0003308693936411421 img/s: 3558.1227681826012 loss: 0.5090 (0.5163) acc1: 99.2188 (99.3374) acc5: 100.0000 (99.9932) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [12] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5338 (0.5338) acc1: 99.2188 (99.2188) acc5: 100.0000 (100.0000) time: 0.1201 data: 0.1064 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.620 Acc@5 99.970
Epoch: [13] [ 0/468] eta: 0:01:24 lr: 0.00019098300562505265 img/s: 1682.8015572007998 loss: 0.5060 (0.5060) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.1802 data: 0.1041 max mem: 190
Epoch: [13] [ 10/468] eta: 0:00:23 lr: 0.00019098300562505265 img/s: 3474.173064478555 loss: 0.5087 (0.5116) acc1: 100.0000 (99.7159) acc5: 100.0000 (100.0000) time: 0.0514 data: 0.0097 max mem: 190
Epoch: [13] [ 20/468] eta: 0:00:20 lr: 0.00019098300562505265 img/s: 3444.748299668917 loss: 0.5079 (0.5102) acc1: 100.0000 (99.7396) acc5: 100.0000 (100.0000) time: 0.0382 data: 0.0002 max mem: 190
Epoch: [13] [ 30/468] eta: 0:00:18 lr: 0.00019098300562505265 img/s: 3511.117366225001 loss: 0.5056 (0.5099) acc1: 100.0000 (99.6976) acc5: 100.0000 (100.0000) time: 0.0375 data: 0.0002 max mem: 190
Epoch: [13] [ 40/468] eta: 0:00:18 lr: 0.00019098300562505265 img/s: 1363.658307488716 loss: 0.5043 (0.5086) acc1: 100.0000 (99.7523) acc5: 100.0000 (100.0000) time: 0.0407 data: 0.0003 max mem: 190
Epoch: [13] [ 50/468] eta: 0:00:18 lr: 0.00019098300562505265 img/s: 3069.394789350012 loss: 0.5024 (0.5089) acc1: 100.0000 (99.7243) acc5: 100.0000 (100.0000) time: 0.0470 data: 0.0023 max mem: 190
Epoch: [13] [ 60/468] eta: 0:00:17 lr: 0.00019098300562505265 img/s: 3438.2623442163103 loss: 0.5078 (0.5095) acc1: 99.2188 (99.6926) acc5: 100.0000 (99.9872) time: 0.0461 data: 0.0028 max mem: 190
Epoch: [13] [ 70/468] eta: 0:00:17 lr: 0.00019098300562505265 img/s: 3492.479358842586 loss: 0.5070 (0.5098) acc1: 99.2188 (99.6589) acc5: 100.0000 (99.9890) time: 0.0406 data: 0.0008 max mem: 190
Epoch: [13] [ 80/468] eta: 0:00:16 lr: 0.00019098300562505265 img/s: 3470.579680914333 loss: 0.5058 (0.5097) acc1: 99.2188 (99.6431) acc5: 100.0000 (99.9904) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [13] [ 90/468] eta: 0:00:15 lr: 0.00019098300562505265 img/s: 3411.7802208975713 loss: 0.5038 (0.5095) acc1: 100.0000 (99.6394) acc5: 100.0000 (99.9914) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [13] [100/468] eta: 0:00:15 lr: 0.00019098300562505265 img/s: 3476.692863618702 loss: 0.5039 (0.5095) acc1: 100.0000 (99.6364) acc5: 100.0000 (99.9923) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [13] [110/468] eta: 0:00:14 lr: 0.00019098300562505265 img/s: 3369.5955011046394 loss: 0.5042 (0.5100) acc1: 99.2188 (99.6199) acc5: 100.0000 (99.9930) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [13] [120/468] eta: 0:00:14 lr: 0.00019098300562505265 img/s: 2763.3305470342384 loss: 0.5049 (0.5098) acc1: 100.0000 (99.6320) acc5: 100.0000 (99.9935) time: 0.0390 data: 0.0002 max mem: 190
Epoch: [13] [130/468] eta: 0:00:13 lr: 0.00019098300562505265 img/s: 3448.089042459586 loss: 0.5049 (0.5101) acc1: 100.0000 (99.6243) acc5: 100.0000 (99.9940) time: 0.0398 data: 0.0003 max mem: 190
Epoch: [13] [140/468] eta: 0:00:13 lr: 0.00019098300562505265 img/s: 3369.5320559087686 loss: 0.5095 (0.5103) acc1: 99.2188 (99.6121) acc5: 100.0000 (99.9945) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [13] [150/468] eta: 0:00:12 lr: 0.00019098300562505265 img/s: 3471.432436277109 loss: 0.5060 (0.5102) acc1: 99.2188 (99.6171) acc5: 100.0000 (99.9948) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [13] [160/468] eta: 0:00:12 lr: 0.00019098300562505265 img/s: 3444.505187247775 loss: 0.5056 (0.5102) acc1: 100.0000 (99.6167) acc5: 100.0000 (99.9951) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [13] [170/468] eta: 0:00:11 lr: 0.00019098300562505265 img/s: 3307.5865570033575 loss: 0.5027 (0.5099) acc1: 100.0000 (99.6254) acc5: 100.0000 (99.9954) time: 0.0389 data: 0.0003 max mem: 190
Epoch: [13] [180/468] eta: 0:00:11 lr: 0.00019098300562505265 img/s: 3425.866161277766 loss: 0.5031 (0.5099) acc1: 100.0000 (99.6202) acc5: 100.0000 (99.9957) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [13] [190/468] eta: 0:00:11 lr: 0.00019098300562505265 img/s: 3484.432536978264 loss: 0.5074 (0.5100) acc1: 99.2188 (99.6073) acc5: 100.0000 (99.9959) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [13] [200/468] eta: 0:00:10 lr: 0.00019098300562505265 img/s: 3348.119189273464 loss: 0.5049 (0.5099) acc1: 100.0000 (99.6191) acc5: 100.0000 (99.9961) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [13] [210/468] eta: 0:00:10 lr: 0.00019098300562505265 img/s: 3368.7074857250423 loss: 0.5055 (0.5099) acc1: 100.0000 (99.6149) acc5: 100.0000 (99.9963) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [13] [220/468] eta: 0:00:09 lr: 0.00019098300562505265 img/s: 3569.2644483595386 loss: 0.5026 (0.5098) acc1: 100.0000 (99.6182) acc5: 100.0000 (99.9965) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [13] [230/468] eta: 0:00:09 lr: 0.00019098300562505265 img/s: 3386.0659337887014 loss: 0.5020 (0.5096) acc1: 100.0000 (99.6280) acc5: 100.0000 (99.9966) time: 0.0376 data: 0.0002 max mem: 190
Epoch: [13] [240/468] eta: 0:00:08 lr: 0.00019098300562505265 img/s: 3378.607779588806 loss: 0.5026 (0.5096) acc1: 100.0000 (99.6304) acc5: 100.0000 (99.9968) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [13] [250/468] eta: 0:00:08 lr: 0.00019098300562505265 img/s: 3548.785468294521 loss: 0.5026 (0.5094) acc1: 100.0000 (99.6452) acc5: 100.0000 (99.9969) time: 0.0404 data: 0.0006 max mem: 190
Epoch: [13] [260/468] eta: 0:00:08 lr: 0.00019098300562505265 img/s: 3539.357040201469 loss: 0.5032 (0.5096) acc1: 100.0000 (99.6288) acc5: 100.0000 (99.9970) time: 0.0427 data: 0.0006 max mem: 190
Epoch: [13] [270/468] eta: 0:00:07 lr: 0.00019098300562505265 img/s: 3462.633343437796 loss: 0.5123 (0.5099) acc1: 99.2188 (99.6166) acc5: 100.0000 (99.9971) time: 0.0395 data: 0.0003 max mem: 190
Epoch: [13] [280/468] eta: 0:00:07 lr: 0.00019098300562505265 img/s: 3555.130432479323 loss: 0.5059 (0.5099) acc1: 99.2188 (99.6191) acc5: 100.0000 (99.9972) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [13] [290/468] eta: 0:00:07 lr: 0.00019098300562505265 img/s: 3626.9915214733046 loss: 0.5069 (0.5104) acc1: 99.2188 (99.6000) acc5: 100.0000 (99.9973) time: 0.0364 data: 0.0003 max mem: 190
Epoch: [13] [300/468] eta: 0:00:06 lr: 0.00019098300562505265 img/s: 3500.6319084009283 loss: 0.5151 (0.5105) acc1: 99.2188 (99.5977) acc5: 100.0000 (99.9948) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [13] [310/468] eta: 0:00:06 lr: 0.00019098300562505265 img/s: 3429.871409588061 loss: 0.5056 (0.5106) acc1: 100.0000 (99.5981) acc5: 100.0000 (99.9950) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [13] [320/468] eta: 0:00:05 lr: 0.00019098300562505265 img/s: 3511.392938898845 loss: 0.5083 (0.5109) acc1: 99.2188 (99.5887) acc5: 100.0000 (99.9951) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [13] [330/468] eta: 0:00:05 lr: 0.00019098300562505265 img/s: 3008.3037492365365 loss: 0.5073 (0.5108) acc1: 100.0000 (99.5917) acc5: 100.0000 (99.9953) time: 0.0458 data: 0.0019 max mem: 190
Epoch: [13] [340/468] eta: 0:00:05 lr: 0.00019098300562505265 img/s: 3446.9378567347017 loss: 0.5035 (0.5106) acc1: 100.0000 (99.6014) acc5: 100.0000 (99.9954) time: 0.0487 data: 0.0032 max mem: 190
Epoch: [13] [350/468] eta: 0:00:04 lr: 0.00019098300562505265 img/s: 3523.006181507973 loss: 0.5034 (0.5106) acc1: 100.0000 (99.5971) acc5: 100.0000 (99.9955) time: 0.0402 data: 0.0015 max mem: 190
Epoch: [13] [360/468] eta: 0:00:04 lr: 0.00019098300562505265 img/s: 3495.3898720002085 loss: 0.5038 (0.5106) acc1: 100.0000 (99.5996) acc5: 100.0000 (99.9957) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [13] [370/468] eta: 0:00:03 lr: 0.00019098300562505265 img/s: 3456.858794895239 loss: 0.5027 (0.5105) acc1: 100.0000 (99.6062) acc5: 100.0000 (99.9958) time: 0.0376 data: 0.0002 max mem: 190
Epoch: [13] [380/468] eta: 0:00:03 lr: 0.00019098300562505265 img/s: 3468.7857752048176 loss: 0.5026 (0.5104) acc1: 100.0000 (99.6125) acc5: 100.0000 (99.9959) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [13] [390/468] eta: 0:00:03 lr: 0.00019098300562505265 img/s: 3539.4037077081302 loss: 0.5050 (0.5105) acc1: 100.0000 (99.6084) acc5: 100.0000 (99.9960) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [13] [400/468] eta: 0:00:02 lr: 0.00019098300562505265 img/s: 3395.0592981857044 loss: 0.5056 (0.5105) acc1: 100.0000 (99.6103) acc5: 100.0000 (99.9961) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [13] [410/468] eta: 0:00:02 lr: 0.00019098300562505265 img/s: 3485.880490607936 loss: 0.5043 (0.5105) acc1: 100.0000 (99.6084) acc5: 100.0000 (99.9962) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [13] [420/468] eta: 0:00:01 lr: 0.00019098300562505265 img/s: 3440.598000512689 loss: 0.5027 (0.5104) acc1: 100.0000 (99.6122) acc5: 100.0000 (99.9963) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [13] [430/468] eta: 0:00:01 lr: 0.00019098300562505265 img/s: 3459.130640962862 loss: 0.5031 (0.5104) acc1: 100.0000 (99.6103) acc5: 100.0000 (99.9964) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [13] [440/468] eta: 0:00:01 lr: 0.00019098300562505265 img/s: 3513.5071006923995 loss: 0.5042 (0.5105) acc1: 100.0000 (99.6120) acc5: 100.0000 (99.9947) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [13] [450/468] eta: 0:00:00 lr: 0.00019098300562505265 img/s: 3505.61498178209 loss: 0.5024 (0.5103) acc1: 100.0000 (99.6189) acc5: 100.0000 (99.9948) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [13] [460/468] eta: 0:00:00 lr: 0.00019098300562505265 img/s: 3506.6453648245274 loss: 0.5038 (0.5103) acc1: 100.0000 (99.6153) acc5: 100.0000 (99.9949) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [13] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:14 loss: 0.5582 (0.5582) acc1: 97.6562 (97.6562) acc5: 100.0000 (100.0000) time: 0.1795 data: 0.1629 max mem: 190
Test: Total time: 0:00:02
Test: Acc@1 98.670 Acc@5 99.910
Epoch: [14] [ 0/468] eta: 0:01:18 lr: 8.645454235739902e-05 img/s: 2322.0256738521157 loss: 0.5286 (0.5286) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1679 data: 0.1128 max mem: 190
Epoch: [14] [ 10/468] eta: 0:00:24 lr: 8.645454235739902e-05 img/s: 3280.4039594280825 loss: 0.5064 (0.5121) acc1: 100.0000 (99.5739) acc5: 100.0000 (100.0000) time: 0.0536 data: 0.0106 max mem: 190
Epoch: [14] [ 20/468] eta: 0:00:20 lr: 8.645454235739902e-05 img/s: 3501.63652491521 loss: 0.5060 (0.5105) acc1: 100.0000 (99.5908) acc5: 100.0000 (100.0000) time: 0.0406 data: 0.0003 max mem: 190
Epoch: [14] [ 30/468] eta: 0:00:19 lr: 8.645454235739902e-05 img/s: 3416.948268839104 loss: 0.5057 (0.5089) acc1: 100.0000 (99.6472) acc5: 100.0000 (100.0000) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [14] [ 40/468] eta: 0:00:17 lr: 8.645454235739902e-05 img/s: 3601.3718824208113 loss: 0.5030 (0.5086) acc1: 100.0000 (99.6951) acc5: 100.0000 (100.0000) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [14] [ 50/468] eta: 0:00:17 lr: 8.645454235739902e-05 img/s: 3491.0940220960706 loss: 0.5035 (0.5095) acc1: 100.0000 (99.6630) acc5: 100.0000 (100.0000) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [14] [ 60/468] eta: 0:00:16 lr: 8.645454235739902e-05 img/s: 3246.169034863894 loss: 0.5039 (0.5095) acc1: 100.0000 (99.6670) acc5: 100.0000 (100.0000) time: 0.0383 data: 0.0008 max mem: 190
Epoch: [14] [ 70/468] eta: 0:00:16 lr: 8.645454235739902e-05 img/s: 3279.902935516388 loss: 0.5030 (0.5087) acc1: 100.0000 (99.6919) acc5: 100.0000 (100.0000) time: 0.0399 data: 0.0008 max mem: 190
Epoch: [14] [ 80/468] eta: 0:00:15 lr: 8.645454235739902e-05 img/s: 1911.368477266335 loss: 0.5030 (0.5087) acc1: 100.0000 (99.7010) acc5: 100.0000 (99.9904) time: 0.0412 data: 0.0003 max mem: 190
Epoch: [14] [ 90/468] eta: 0:00:16 lr: 8.645454235739902e-05 img/s: 2397.2909546369933 loss: 0.5029 (0.5091) acc1: 100.0000 (99.6995) acc5: 100.0000 (99.9914) time: 0.0512 data: 0.0023 max mem: 190
Epoch: [14] [100/468] eta: 0:00:15 lr: 8.645454235739902e-05 img/s: 3118.043186860415 loss: 0.5028 (0.5085) acc1: 100.0000 (99.7215) acc5: 100.0000 (99.9923) time: 0.0517 data: 0.0030 max mem: 190
Epoch: [14] [110/468] eta: 0:00:15 lr: 8.645454235739902e-05 img/s: 3220.793885619327 loss: 0.5028 (0.5086) acc1: 100.0000 (99.7185) acc5: 100.0000 (99.9930) time: 0.0421 data: 0.0010 max mem: 190
Epoch: [14] [120/468] eta: 0:00:14 lr: 8.645454235739902e-05 img/s: 3145.684992587992 loss: 0.5033 (0.5085) acc1: 100.0000 (99.7224) acc5: 100.0000 (99.9935) time: 0.0408 data: 0.0003 max mem: 190
Epoch: [14] [130/468] eta: 0:00:14 lr: 8.645454235739902e-05 img/s: 3213.7759393725346 loss: 0.5029 (0.5083) acc1: 100.0000 (99.7197) acc5: 100.0000 (99.9940) time: 0.0420 data: 0.0003 max mem: 190
Epoch: [14] [140/468] eta: 0:00:13 lr: 8.645454235739902e-05 img/s: 2861.4496807410646 loss: 0.5028 (0.5083) acc1: 100.0000 (99.7174) acc5: 100.0000 (99.9945) time: 0.0417 data: 0.0003 max mem: 190
Epoch: [14] [150/468] eta: 0:00:13 lr: 8.645454235739902e-05 img/s: 2756.477801680983 loss: 0.5024 (0.5081) acc1: 100.0000 (99.7258) acc5: 100.0000 (99.9948) time: 0.0400 data: 0.0003 max mem: 190
Epoch: [14] [160/468] eta: 0:00:12 lr: 8.645454235739902e-05 img/s: 3562.7271170806484 loss: 0.5023 (0.5078) acc1: 100.0000 (99.7428) acc5: 100.0000 (99.9951) time: 0.0386 data: 0.0003 max mem: 190
Epoch: [14] [170/468] eta: 0:00:12 lr: 8.645454235739902e-05 img/s: 3452.013271263599 loss: 0.5018 (0.5078) acc1: 100.0000 (99.7350) acc5: 100.0000 (99.9954) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [14] [180/468] eta: 0:00:11 lr: 8.645454235739902e-05 img/s: 3525.967818628417 loss: 0.5033 (0.5079) acc1: 100.0000 (99.7194) acc5: 100.0000 (99.9957) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [14] [190/468] eta: 0:00:11 lr: 8.645454235739902e-05 img/s: 3569.5254913433155 loss: 0.5049 (0.5080) acc1: 100.0000 (99.7137) acc5: 100.0000 (99.9959) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [14] [200/468] eta: 0:00:10 lr: 8.645454235739902e-05 img/s: 3462.164418190729 loss: 0.5022 (0.5078) acc1: 100.0000 (99.7240) acc5: 100.0000 (99.9961) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [14] [210/468] eta: 0:00:10 lr: 8.645454235739902e-05 img/s: 3587.1746846268975 loss: 0.5022 (0.5081) acc1: 100.0000 (99.7186) acc5: 100.0000 (99.9963) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [14] [220/468] eta: 0:00:10 lr: 8.645454235739902e-05 img/s: 3513.277177185038 loss: 0.5015 (0.5081) acc1: 100.0000 (99.7207) acc5: 100.0000 (99.9965) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [14] [230/468] eta: 0:00:09 lr: 8.645454235739902e-05 img/s: 3025.5794053335135 loss: 0.5026 (0.5081) acc1: 100.0000 (99.7193) acc5: 100.0000 (99.9966) time: 0.0430 data: 0.0003 max mem: 190
Epoch: [14] [240/468] eta: 0:00:09 lr: 8.645454235739902e-05 img/s: 3155.651043319814 loss: 0.5033 (0.5079) acc1: 100.0000 (99.7309) acc5: 100.0000 (99.9968) time: 0.0479 data: 0.0007 max mem: 190
Epoch: [14] [250/468] eta: 0:00:08 lr: 8.645454235739902e-05 img/s: 3337.5248633896767 loss: 0.5024 (0.5077) acc1: 100.0000 (99.7385) acc5: 100.0000 (99.9969) time: 0.0429 data: 0.0007 max mem: 190
Epoch: [14] [260/468] eta: 0:00:08 lr: 8.645454235739902e-05 img/s: 3527.5432145813897 loss: 0.5022 (0.5075) acc1: 100.0000 (99.7426) acc5: 100.0000 (99.9970) time: 0.0387 data: 0.0003 max mem: 190
Epoch: [14] [270/468] eta: 0:00:08 lr: 8.645454235739902e-05 img/s: 3607.8579627165573 loss: 0.5031 (0.5078) acc1: 100.0000 (99.7348) acc5: 100.0000 (99.9971) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [14] [280/468] eta: 0:00:07 lr: 8.645454235739902e-05 img/s: 3467.2176282920655 loss: 0.5059 (0.5078) acc1: 100.0000 (99.7331) acc5: 100.0000 (99.9972) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [14] [290/468] eta: 0:00:07 lr: 8.645454235739902e-05 img/s: 3557.085483336646 loss: 0.5026 (0.5077) acc1: 100.0000 (99.7396) acc5: 100.0000 (99.9973) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [14] [300/468] eta: 0:00:06 lr: 8.645454235739902e-05 img/s: 3545.7619739518664 loss: 0.5026 (0.5081) acc1: 100.0000 (99.7275) acc5: 100.0000 (99.9974) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [14] [310/468] eta: 0:00:06 lr: 8.645454235739902e-05 img/s: 3492.433920532903 loss: 0.5024 (0.5079) acc1: 100.0000 (99.7337) acc5: 100.0000 (99.9975) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [14] [320/468] eta: 0:00:05 lr: 8.645454235739902e-05 img/s: 3540.3840097070733 loss: 0.5016 (0.5078) acc1: 100.0000 (99.7396) acc5: 100.0000 (99.9976) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [14] [330/468] eta: 0:00:05 lr: 8.645454235739902e-05 img/s: 3359.495591557316 loss: 0.5020 (0.5077) acc1: 100.0000 (99.7404) acc5: 100.0000 (99.9976) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [14] [340/468] eta: 0:00:05 lr: 8.645454235739902e-05 img/s: 3507.4701074706823 loss: 0.5033 (0.5076) acc1: 100.0000 (99.7411) acc5: 100.0000 (99.9977) time: 0.0376 data: 0.0002 max mem: 190
Epoch: [14] [350/468] eta: 0:00:04 lr: 8.645454235739902e-05 img/s: 3505.5920912587253 loss: 0.5036 (0.5078) acc1: 100.0000 (99.7396) acc5: 100.0000 (99.9978) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [14] [360/468] eta: 0:00:04 lr: 8.645454235739902e-05 img/s: 3241.112941006013 loss: 0.5040 (0.5078) acc1: 100.0000 (99.7381) acc5: 100.0000 (99.9978) time: 0.0428 data: 0.0006 max mem: 190
Epoch: [14] [370/468] eta: 0:00:03 lr: 8.645454235739902e-05 img/s: 3499.468187595737 loss: 0.5020 (0.5078) acc1: 100.0000 (99.7410) acc5: 100.0000 (99.9979) time: 0.0486 data: 0.0030 max mem: 190
Epoch: [14] [380/468] eta: 0:00:03 lr: 8.645454235739902e-05 img/s: 3273.283777192469 loss: 0.5021 (0.5079) acc1: 100.0000 (99.7396) acc5: 100.0000 (99.9979) time: 0.0429 data: 0.0026 max mem: 190
Epoch: [14] [390/468] eta: 0:00:03 lr: 8.645454235739902e-05 img/s: 3569.5254913433155 loss: 0.5042 (0.5079) acc1: 100.0000 (99.7383) acc5: 100.0000 (99.9980) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [14] [400/468] eta: 0:00:02 lr: 8.645454235739902e-05 img/s: 3462.8343502883163 loss: 0.5022 (0.5080) acc1: 100.0000 (99.7350) acc5: 100.0000 (99.9981) time: 0.0376 data: 0.0002 max mem: 190
Epoch: [14] [410/468] eta: 0:00:02 lr: 8.645454235739902e-05 img/s: 3365.64531235307 loss: 0.5032 (0.5081) acc1: 100.0000 (99.7320) acc5: 100.0000 (99.9962) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [14] [420/468] eta: 0:00:01 lr: 8.645454235739902e-05 img/s: 3430.1343760382324 loss: 0.5045 (0.5080) acc1: 100.0000 (99.7328) acc5: 100.0000 (99.9963) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [14] [430/468] eta: 0:00:01 lr: 8.645454235739902e-05 img/s: 3436.6776683864855 loss: 0.5019 (0.5079) acc1: 100.0000 (99.7372) acc5: 100.0000 (99.9964) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [14] [440/468] eta: 0:00:01 lr: 8.645454235739902e-05 img/s: 3557.2740355945457 loss: 0.5018 (0.5077) acc1: 100.0000 (99.7431) acc5: 100.0000 (99.9965) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [14] [450/468] eta: 0:00:00 lr: 8.645454235739902e-05 img/s: 3501.362481412882 loss: 0.5031 (0.5078) acc1: 100.0000 (99.7384) acc5: 100.0000 (99.9965) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [14] [460/468] eta: 0:00:00 lr: 8.645454235739902e-05 img/s: 3422.76470325719 loss: 0.5031 (0.5077) acc1: 100.0000 (99.7407) acc5: 100.0000 (99.9966) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [14] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5510 (0.5510) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1211 data: 0.1021 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.750 Acc@5 99.920
Epoch: [15] [ 0/468] eta: 0:01:14 lr: 2.1852399266194312e-05 img/s: 2757.893798127058 loss: 0.5055 (0.5055) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.1598 data: 0.1133 max mem: 190
Epoch: [15] [ 10/468] eta: 0:00:22 lr: 2.1852399266194312e-05 img/s: 3457.9943447876076 loss: 0.5040 (0.5057) acc1: 100.0000 (99.7869) acc5: 100.0000 (100.0000) time: 0.0488 data: 0.0105 max mem: 190
Epoch: [15] [ 20/468] eta: 0:00:19 lr: 2.1852399266194312e-05 img/s: 3491.298346924708 loss: 0.5027 (0.5084) acc1: 100.0000 (99.7396) acc5: 100.0000 (100.0000) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [15] [ 30/468] eta: 0:00:18 lr: 2.1852399266194312e-05 img/s: 3435.9738368 loss: 0.5024 (0.5065) acc1: 100.0000 (99.7984) acc5: 100.0000 (100.0000) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [15] [ 40/468] eta: 0:00:17 lr: 2.1852399266194312e-05 img/s: 3355.107689231077 loss: 0.5020 (0.5062) acc1: 100.0000 (99.8285) acc5: 100.0000 (100.0000) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [15] [ 50/468] eta: 0:00:16 lr: 2.1852399266194312e-05 img/s: 3455.2569347011804 loss: 0.5019 (0.5060) acc1: 100.0000 (99.8468) acc5: 100.0000 (100.0000) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [15] [ 60/468] eta: 0:00:16 lr: 2.1852399266194312e-05 img/s: 3422.2628827864046 loss: 0.5019 (0.5068) acc1: 100.0000 (99.8207) acc5: 100.0000 (100.0000) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [15] [ 70/468] eta: 0:00:15 lr: 2.1852399266194312e-05 img/s: 3367.4185823334233 loss: 0.5016 (0.5069) acc1: 100.0000 (99.8239) acc5: 100.0000 (100.0000) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [15] [ 80/468] eta: 0:00:15 lr: 2.1852399266194312e-05 img/s: 3468.8081875803605 loss: 0.5019 (0.5069) acc1: 100.0000 (99.8264) acc5: 100.0000 (99.9904) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [15] [ 90/468] eta: 0:00:14 lr: 2.1852399266194312e-05 img/s: 3418.9501999643376 loss: 0.5028 (0.5071) acc1: 100.0000 (99.8025) acc5: 100.0000 (99.9914) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [15] [100/468] eta: 0:00:14 lr: 2.1852399266194312e-05 img/s: 3447.491215452584 loss: 0.5026 (0.5069) acc1: 100.0000 (99.8066) acc5: 100.0000 (99.9923) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [15] [110/468] eta: 0:00:13 lr: 2.1852399266194312e-05 img/s: 3479.8252020663595 loss: 0.5016 (0.5068) acc1: 100.0000 (99.8100) acc5: 100.0000 (99.9930) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [15] [120/468] eta: 0:00:13 lr: 2.1852399266194312e-05 img/s: 3482.601694365521 loss: 0.5021 (0.5070) acc1: 100.0000 (99.7998) acc5: 100.0000 (99.9935) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [15] [130/468] eta: 0:00:13 lr: 2.1852399266194312e-05 img/s: 1422.856818766084 loss: 0.5022 (0.5067) acc1: 100.0000 (99.8092) acc5: 100.0000 (99.9940) time: 0.0414 data: 0.0003 max mem: 190
Epoch: [15] [140/468] eta: 0:00:13 lr: 2.1852399266194312e-05 img/s: 2415.8780340734206 loss: 0.5016 (0.5069) acc1: 100.0000 (99.8005) acc5: 100.0000 (99.9945) time: 0.0490 data: 0.0027 max mem: 190
Epoch: [15] [150/468] eta: 0:00:12 lr: 2.1852399266194312e-05 img/s: 3300.9161906752825 loss: 0.5014 (0.5066) acc1: 100.0000 (99.8086) acc5: 100.0000 (99.9948) time: 0.0458 data: 0.0027 max mem: 190
Epoch: [15] [160/468] eta: 0:00:12 lr: 2.1852399266194312e-05 img/s: 3474.397898033937 loss: 0.5013 (0.5064) acc1: 100.0000 (99.8156) acc5: 100.0000 (99.9951) time: 0.0396 data: 0.0003 max mem: 190
Epoch: [15] [170/468] eta: 0:00:11 lr: 2.1852399266194312e-05 img/s: 3440.487756736839 loss: 0.5015 (0.5061) acc1: 100.0000 (99.8264) acc5: 100.0000 (99.9954) time: 0.0391 data: 0.0003 max mem: 190
Epoch: [15] [180/468] eta: 0:00:11 lr: 2.1852399266194312e-05 img/s: 3407.2763921150504 loss: 0.5015 (0.5062) acc1: 100.0000 (99.8273) acc5: 100.0000 (99.9957) time: 0.0378 data: 0.0002 max mem: 190
Epoch: [15] [190/468] eta: 0:00:11 lr: 2.1852399266194312e-05 img/s: 3543.2814055095764 loss: 0.5026 (0.5062) acc1: 100.0000 (99.8200) acc5: 100.0000 (99.9959) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [15] [200/468] eta: 0:00:10 lr: 2.1852399266194312e-05 img/s: 3530.1642677257514 loss: 0.5032 (0.5064) acc1: 100.0000 (99.8134) acc5: 100.0000 (99.9961) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [15] [210/468] eta: 0:00:10 lr: 2.1852399266194312e-05 img/s: 3561.9234499917065 loss: 0.5021 (0.5062) acc1: 100.0000 (99.8223) acc5: 100.0000 (99.9963) time: 0.0372 data: 0.0005 max mem: 190
Epoch: [15] [220/468] eta: 0:00:09 lr: 2.1852399266194312e-05 img/s: 3406.4332476761524 loss: 0.5014 (0.5061) acc1: 100.0000 (99.8268) acc5: 100.0000 (99.9965) time: 0.0373 data: 0.0005 max mem: 190
Epoch: [15] [230/468] eta: 0:00:09 lr: 2.1852399266194312e-05 img/s: 2149.091168193809 loss: 0.5022 (0.5064) acc1: 100.0000 (99.8174) acc5: 100.0000 (99.9966) time: 0.0398 data: 0.0003 max mem: 190
Epoch: [15] [240/468] eta: 0:00:09 lr: 2.1852399266194312e-05 img/s: 1784.0806850921997 loss: 0.5031 (0.5063) acc1: 100.0000 (99.8185) acc5: 100.0000 (99.9968) time: 0.0456 data: 0.0003 max mem: 190
Epoch: [15] [250/468] eta: 0:00:08 lr: 2.1852399266194312e-05 img/s: 2532.445798761303 loss: 0.5021 (0.5064) acc1: 100.0000 (99.8164) acc5: 100.0000 (99.9969) time: 0.0536 data: 0.0026 max mem: 190
Epoch: [15] [260/468] eta: 0:00:08 lr: 2.1852399266194312e-05 img/s: 2783.5606619934465 loss: 0.5015 (0.5062) acc1: 100.0000 (99.8234) acc5: 100.0000 (99.9970) time: 0.0567 data: 0.0063 max mem: 190
Epoch: [15] [270/468] eta: 0:00:08 lr: 2.1852399266194312e-05 img/s: 3299.638071122147 loss: 0.5016 (0.5064) acc1: 100.0000 (99.8155) acc5: 100.0000 (99.9971) time: 0.0464 data: 0.0040 max mem: 190
Epoch: [15] [280/468] eta: 0:00:07 lr: 2.1852399266194312e-05 img/s: 3415.40490231629 loss: 0.5020 (0.5065) acc1: 100.0000 (99.8165) acc5: 100.0000 (99.9972) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [15] [290/468] eta: 0:00:07 lr: 2.1852399266194312e-05 img/s: 3205.0463977839863 loss: 0.5019 (0.5065) acc1: 100.0000 (99.8148) acc5: 100.0000 (99.9973) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [15] [300/468] eta: 0:00:06 lr: 2.1852399266194312e-05 img/s: 3537.0485357578154 loss: 0.5014 (0.5064) acc1: 100.0000 (99.8183) acc5: 100.0000 (99.9974) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [15] [310/468] eta: 0:00:06 lr: 2.1852399266194312e-05 img/s: 3132.0132077893286 loss: 0.5016 (0.5064) acc1: 100.0000 (99.8216) acc5: 100.0000 (99.9975) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [15] [320/468] eta: 0:00:06 lr: 2.1852399266194312e-05 img/s: 3513.254186489369 loss: 0.5017 (0.5063) acc1: 100.0000 (99.8223) acc5: 100.0000 (99.9976) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [15] [330/468] eta: 0:00:05 lr: 2.1852399266194312e-05 img/s: 3602.870319168926 loss: 0.5017 (0.5062) acc1: 100.0000 (99.8253) acc5: 100.0000 (99.9976) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [15] [340/468] eta: 0:00:05 lr: 2.1852399266194312e-05 img/s: 3537.328194079316 loss: 0.5017 (0.5061) acc1: 100.0000 (99.8305) acc5: 100.0000 (99.9977) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [15] [350/468] eta: 0:00:04 lr: 2.1852399266194312e-05 img/s: 3544.029890550942 loss: 0.5014 (0.5060) acc1: 100.0000 (99.8331) acc5: 100.0000 (99.9978) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [15] [360/468] eta: 0:00:04 lr: 2.1852399266194312e-05 img/s: 3440.9067206747595 loss: 0.5013 (0.5060) acc1: 100.0000 (99.8312) acc5: 100.0000 (99.9978) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [15] [370/468] eta: 0:00:03 lr: 2.1852399266194312e-05 img/s: 3565.566490227202 loss: 0.5013 (0.5059) acc1: 100.0000 (99.8315) acc5: 100.0000 (99.9979) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [15] [380/468] eta: 0:00:03 lr: 2.1852399266194312e-05 img/s: 3579.234859596256 loss: 0.5012 (0.5059) acc1: 100.0000 (99.8339) acc5: 100.0000 (99.9979) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [15] [390/468] eta: 0:00:03 lr: 2.1852399266194312e-05 img/s: 3276.220102642965 loss: 0.5025 (0.5059) acc1: 100.0000 (99.8322) acc5: 100.0000 (99.9980) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [15] [400/468] eta: 0:00:02 lr: 2.1852399266194312e-05 img/s: 2341.0613270076046 loss: 0.5033 (0.5059) acc1: 100.0000 (99.8344) acc5: 100.0000 (99.9981) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [15] [410/468] eta: 0:00:02 lr: 2.1852399266194312e-05 img/s: 2289.818783587819 loss: 0.5022 (0.5059) acc1: 100.0000 (99.8327) acc5: 100.0000 (99.9981) time: 0.0504 data: 0.0026 max mem: 190
Epoch: [15] [420/468] eta: 0:00:01 lr: 2.1852399266194312e-05 img/s: 3458.3730272227162 loss: 0.5017 (0.5060) acc1: 100.0000 (99.8311) acc5: 100.0000 (99.9981) time: 0.0494 data: 0.0026 max mem: 190
Epoch: [15] [430/468] eta: 0:00:01 lr: 2.1852399266194312e-05 img/s: 3442.3407903257867 loss: 0.5016 (0.5060) acc1: 100.0000 (99.8296) acc5: 100.0000 (99.9982) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [15] [440/468] eta: 0:00:01 lr: 2.1852399266194312e-05 img/s: 3324.9780882662603 loss: 0.5012 (0.5059) acc1: 100.0000 (99.8335) acc5: 100.0000 (99.9982) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [15] [450/468] eta: 0:00:00 lr: 2.1852399266194312e-05 img/s: 3340.619202289839 loss: 0.5010 (0.5058) acc1: 100.0000 (99.8354) acc5: 100.0000 (99.9983) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [15] [460/468] eta: 0:00:00 lr: 2.1852399266194312e-05 img/s: 3465.6285269796595 loss: 0.5015 (0.5059) acc1: 100.0000 (99.8339) acc5: 100.0000 (99.9983) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [15] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5526 (0.5526) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1216 data: 0.1018 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.760 Acc@5 99.900
Epoch: [16] [ 0/468] eta: 0:01:19 lr: 0.0 img/s: 1982.2950379015851 loss: 0.5016 (0.5016) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.1703 data: 0.1057 max mem: 190
Epoch: [16] [ 10/468] eta: 0:00:23 lr: 0.0 img/s: 3292.55539201256 loss: 0.5020 (0.5069) acc1: 100.0000 (99.8580) acc5: 100.0000 (100.0000) time: 0.0522 data: 0.0099 max mem: 190
Epoch: [16] [ 20/468] eta: 0:00:20 lr: 0.0 img/s: 3561.07290346973 loss: 0.5019 (0.5053) acc1: 100.0000 (99.8884) acc5: 100.0000 (100.0000) time: 0.0401 data: 0.0003 max mem: 190
Epoch: [16] [ 30/468] eta: 0:00:18 lr: 0.0 img/s: 3545.9024873518883 loss: 0.5019 (0.5074) acc1: 100.0000 (99.7984) acc5: 100.0000 (100.0000) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [16] [ 40/468] eta: 0:00:17 lr: 0.0 img/s: 3467.934319488405 loss: 0.5022 (0.5069) acc1: 100.0000 (99.8095) acc5: 100.0000 (100.0000) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [16] [ 50/468] eta: 0:00:17 lr: 0.0 img/s: 3523.9311585165738 loss: 0.5020 (0.5070) acc1: 100.0000 (99.8162) acc5: 100.0000 (100.0000) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [16] [ 60/468] eta: 0:00:16 lr: 0.0 img/s: 3601.251095056983 loss: 0.5017 (0.5061) acc1: 100.0000 (99.8463) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [16] [ 70/468] eta: 0:00:15 lr: 0.0 img/s: 3460.6022508991996 loss: 0.5014 (0.5060) acc1: 100.0000 (99.8460) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [16] [ 80/468] eta: 0:00:15 lr: 0.0 img/s: 3590.8215527850607 loss: 0.5012 (0.5058) acc1: 100.0000 (99.8553) acc5: 100.0000 (100.0000) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [16] [ 90/468] eta: 0:00:14 lr: 0.0 img/s: 3534.2311166115887 loss: 0.5014 (0.5057) acc1: 100.0000 (99.8541) acc5: 100.0000 (100.0000) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [16] [100/468] eta: 0:00:14 lr: 0.0 img/s: 3035.878988023207 loss: 0.5015 (0.5057) acc1: 100.0000 (99.8608) acc5: 100.0000 (100.0000) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [16] [110/468] eta: 0:00:13 lr: 0.0 img/s: 3506.8744210959494 loss: 0.5018 (0.5055) acc1: 100.0000 (99.8663) acc5: 100.0000 (100.0000) time: 0.0386 data: 0.0003 max mem: 190
Epoch: [16] [120/468] eta: 0:00:13 lr: 0.0 img/s: 3458.4398621444907 loss: 0.5019 (0.5057) acc1: 100.0000 (99.8644) acc5: 100.0000 (100.0000) time: 0.0383 data: 0.0002 max mem: 190
Epoch: [16] [130/468] eta: 0:00:13 lr: 0.0 img/s: 3514.818239549576 loss: 0.5016 (0.5056) acc1: 100.0000 (99.8688) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [16] [140/468] eta: 0:00:12 lr: 0.0 img/s: 3442.8043606515325 loss: 0.5016 (0.5056) acc1: 100.0000 (99.8670) acc5: 100.0000 (100.0000) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [16] [150/468] eta: 0:00:12 lr: 0.0 img/s: 3508.180613458447 loss: 0.5014 (0.5055) acc1: 100.0000 (99.8707) acc5: 100.0000 (100.0000) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [16] [160/468] eta: 0:00:11 lr: 0.0 img/s: 3484.9753786034676 loss: 0.5014 (0.5057) acc1: 100.0000 (99.8593) acc5: 100.0000 (100.0000) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [16] [170/468] eta: 0:00:11 lr: 0.0 img/s: 1982.7416128698683 loss: 0.5026 (0.5058) acc1: 100.0000 (99.8538) acc5: 100.0000 (100.0000) time: 0.0392 data: 0.0003 max mem: 190
Epoch: [16] [180/468] eta: 0:00:11 lr: 0.0 img/s: 3118.242398545632 loss: 0.5021 (0.5058) acc1: 100.0000 (99.8532) acc5: 100.0000 (100.0000) time: 0.0478 data: 0.0035 max mem: 190
Epoch: [16] [190/468] eta: 0:00:11 lr: 0.0 img/s: 3482.8276202091497 loss: 0.5016 (0.5056) acc1: 100.0000 (99.8568) acc5: 100.0000 (100.0000) time: 0.0482 data: 0.0044 max mem: 190
Epoch: [16] [200/468] eta: 0:00:10 lr: 0.0 img/s: 3515.3475727138198 loss: 0.5014 (0.5056) acc1: 100.0000 (99.8601) acc5: 100.0000 (100.0000) time: 0.0395 data: 0.0011 max mem: 190
Epoch: [16] [210/468] eta: 0:00:10 lr: 0.0 img/s: 3514.680178853166 loss: 0.5012 (0.5054) acc1: 100.0000 (99.8667) acc5: 100.0000 (100.0000) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [16] [220/468] eta: 0:00:09 lr: 0.0 img/s: 3472.6674299316296 loss: 0.5012 (0.5054) acc1: 100.0000 (99.8657) acc5: 100.0000 (100.0000) time: 0.0402 data: 0.0008 max mem: 190
Epoch: [16] [230/468] eta: 0:00:09 lr: 0.0 img/s: 3407.752196211852 loss: 0.5012 (0.5054) acc1: 100.0000 (99.8647) acc5: 100.0000 (100.0000) time: 0.0424 data: 0.0012 max mem: 190
Epoch: [16] [240/468] eta: 0:00:09 lr: 0.0 img/s: 3496.163792654337 loss: 0.5015 (0.5055) acc1: 100.0000 (99.8606) acc5: 100.0000 (100.0000) time: 0.0393 data: 0.0007 max mem: 190
Epoch: [16] [250/468] eta: 0:00:08 lr: 0.0 img/s: 3485.065868652182 loss: 0.5018 (0.5056) acc1: 100.0000 (99.8568) acc5: 100.0000 (100.0000) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [16] [260/468] eta: 0:00:08 lr: 0.0 img/s: 3557.4861807796547 loss: 0.5018 (0.5056) acc1: 100.0000 (99.8593) acc5: 100.0000 (100.0000) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [16] [270/468] eta: 0:00:07 lr: 0.0 img/s: 3434.3473299045572 loss: 0.5021 (0.5055) acc1: 100.0000 (99.8616) acc5: 100.0000 (100.0000) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [16] [280/468] eta: 0:00:07 lr: 0.0 img/s: 3463.7950385496306 loss: 0.5014 (0.5054) acc1: 100.0000 (99.8665) acc5: 100.0000 (100.0000) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [16] [290/468] eta: 0:00:06 lr: 0.0 img/s: 3488.0546788204033 loss: 0.5012 (0.5055) acc1: 100.0000 (99.8631) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [16] [300/468] eta: 0:00:06 lr: 0.0 img/s: 3538.9370880134998 loss: 0.5015 (0.5054) acc1: 100.0000 (99.8650) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [16] [310/468] eta: 0:00:06 lr: 0.0 img/s: 3528.9344393758133 loss: 0.5015 (0.5053) acc1: 100.0000 (99.8669) acc5: 100.0000 (100.0000) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [16] [320/468] eta: 0:00:05 lr: 0.0 img/s: 3418.710715172664 loss: 0.5019 (0.5056) acc1: 100.0000 (99.8588) acc5: 100.0000 (100.0000) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [16] [330/468] eta: 0:00:05 lr: 0.0 img/s: 3593.008425857142 loss: 0.5018 (0.5055) acc1: 100.0000 (99.8631) acc5: 100.0000 (100.0000) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [16] [340/468] eta: 0:00:04 lr: 0.0 img/s: 3582.7460443513137 loss: 0.5013 (0.5054) acc1: 100.0000 (99.8648) acc5: 100.0000 (100.0000) time: 0.0365 data: 0.0003 max mem: 190
Epoch: [16] [350/468] eta: 0:00:04 lr: 0.0 img/s: 3636.6961917277445 loss: 0.5014 (0.5054) acc1: 100.0000 (99.8665) acc5: 100.0000 (100.0000) time: 0.0363 data: 0.0003 max mem: 190
Epoch: [16] [360/468] eta: 0:00:04 lr: 0.0 img/s: 3500.449312781994 loss: 0.5021 (0.5054) acc1: 100.0000 (99.8658) acc5: 100.0000 (100.0000) time: 0.0364 data: 0.0002 max mem: 190
Epoch: [16] [370/468] eta: 0:00:03 lr: 0.0 img/s: 3570.3563367449406 loss: 0.5021 (0.5054) acc1: 100.0000 (99.8673) acc5: 100.0000 (100.0000) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [16] [380/468] eta: 0:00:03 lr: 0.0 img/s: 3518.6422246836064 loss: 0.5018 (0.5054) acc1: 100.0000 (99.8667) acc5: 100.0000 (100.0000) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [16] [390/468] eta: 0:00:03 lr: 0.0 img/s: 3430.5946643662737 loss: 0.5021 (0.5054) acc1: 100.0000 (99.8661) acc5: 100.0000 (100.0000) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [16] [400/468] eta: 0:00:02 lr: 0.0 img/s: 3495.4126294818125 loss: 0.5019 (0.5054) acc1: 100.0000 (99.8656) acc5: 100.0000 (99.9981) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [16] [410/468] eta: 0:00:02 lr: 0.0 img/s: 3535.2085550229153 loss: 0.5014 (0.5054) acc1: 100.0000 (99.8650) acc5: 100.0000 (99.9981) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [16] [420/468] eta: 0:00:01 lr: 0.0 img/s: 3448.155480481445 loss: 0.5017 (0.5054) acc1: 100.0000 (99.8645) acc5: 100.0000 (99.9981) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [16] [430/468] eta: 0:00:01 lr: 0.0 img/s: 3298.2798867133984 loss: 0.5025 (0.5054) acc1: 100.0000 (99.8659) acc5: 100.0000 (99.9982) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [16] [440/468] eta: 0:00:01 lr: 0.0 img/s: 3572.4945734267594 loss: 0.5017 (0.5054) acc1: 100.0000 (99.8671) acc5: 100.0000 (99.9982) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [16] [450/468] eta: 0:00:00 lr: 0.0 img/s: 3348.2653561427687 loss: 0.5011 (0.5054) acc1: 100.0000 (99.8666) acc5: 100.0000 (99.9983) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [16] [460/468] eta: 0:00:00 lr: 0.0 img/s: 2339.847162961381 loss: 0.5018 (0.5054) acc1: 100.0000 (99.8661) acc5: 100.0000 (99.9983) time: 0.0439 data: 0.0006 max mem: 190
Epoch: [16] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:12 loss: 0.5526 (0.5526) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1549 data: 0.1313 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.760 Acc@5 99.900
Epoch: [17] [ 0/468] eta: 0:01:14 lr: 0.0 img/s: 2452.9998766351555 loss: 0.5031 (0.5031) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.1589 data: 0.1066 max mem: 190
Epoch: [17] [ 10/468] eta: 0:00:22 lr: 0.0 img/s: 3470.1086010871745 loss: 0.5013 (0.5042) acc1: 100.0000 (99.9290) acc5: 100.0000 (100.0000) time: 0.0490 data: 0.0100 max mem: 190
Epoch: [17] [ 20/468] eta: 0:00:19 lr: 0.0 img/s: 2451.6332716852753 loss: 0.5011 (0.5029) acc1: 100.0000 (99.9628) acc5: 100.0000 (100.0000) time: 0.0386 data: 0.0003 max mem: 190
Epoch: [17] [ 30/468] eta: 0:00:21 lr: 0.0 img/s: 1973.3548187899728 loss: 0.5012 (0.5050) acc1: 100.0000 (99.8992) acc5: 100.0000 (99.9748) time: 0.0498 data: 0.0035 max mem: 190
Epoch: [17] [ 40/468] eta: 0:00:22 lr: 0.0 img/s: 2807.2709171054626 loss: 0.5012 (0.5047) acc1: 100.0000 (99.9047) acc5: 100.0000 (99.9809) time: 0.0589 data: 0.0053 max mem: 190
Epoch: [17] [ 50/468] eta: 0:00:21 lr: 0.0 img/s: 3277.0000122077763 loss: 0.5015 (0.5047) acc1: 100.0000 (99.9081) acc5: 100.0000 (99.9847) time: 0.0519 data: 0.0032 max mem: 190
Epoch: [17] [ 60/468] eta: 0:00:19 lr: 0.0 img/s: 3144.763687697327 loss: 0.5018 (0.5050) acc1: 100.0000 (99.8847) acc5: 100.0000 (99.9872) time: 0.0432 data: 0.0014 max mem: 190
Epoch: [17] [ 70/468] eta: 0:00:18 lr: 0.0 img/s: 3306.7716547072773 loss: 0.5023 (0.5064) acc1: 100.0000 (99.8460) acc5: 100.0000 (99.9890) time: 0.0393 data: 0.0003 max mem: 190
Epoch: [17] [ 80/468] eta: 0:00:17 lr: 0.0 img/s: 3440.2232004972543 loss: 0.5015 (0.5064) acc1: 100.0000 (99.8457) acc5: 100.0000 (99.9904) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [17] [ 90/468] eta: 0:00:17 lr: 0.0 img/s: 3507.928465484008 loss: 0.5013 (0.5062) acc1: 100.0000 (99.8369) acc5: 100.0000 (99.9914) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [17] [100/468] eta: 0:00:16 lr: 0.0 img/s: 3577.9944551076987 loss: 0.5023 (0.5063) acc1: 100.0000 (99.8376) acc5: 100.0000 (99.9923) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [17] [110/468] eta: 0:00:15 lr: 0.0 img/s: 3463.035380477201 loss: 0.5026 (0.5063) acc1: 100.0000 (99.8311) acc5: 100.0000 (99.9930) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [17] [120/468] eta: 0:00:15 lr: 0.0 img/s: 3505.0428083645074 loss: 0.5021 (0.5059) acc1: 100.0000 (99.8450) acc5: 100.0000 (99.9935) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [17] [130/468] eta: 0:00:14 lr: 0.0 img/s: 3500.129816280495 loss: 0.5015 (0.5058) acc1: 100.0000 (99.8449) acc5: 100.0000 (99.9940) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [17] [140/468] eta: 0:00:13 lr: 0.0 img/s: 3441.568450473089 loss: 0.5015 (0.5056) acc1: 100.0000 (99.8559) acc5: 100.0000 (99.9945) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [17] [150/468] eta: 0:00:13 lr: 0.0 img/s: 3556.6141901291817 loss: 0.5018 (0.5054) acc1: 100.0000 (99.8655) acc5: 100.0000 (99.9948) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [17] [160/468] eta: 0:00:12 lr: 0.0 img/s: 3476.3551785540844 loss: 0.5018 (0.5055) acc1: 100.0000 (99.8593) acc5: 100.0000 (99.9951) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [17] [170/468] eta: 0:00:12 lr: 0.0 img/s: 3535.7440480502632 loss: 0.5022 (0.5056) acc1: 100.0000 (99.8584) acc5: 100.0000 (99.9954) time: 0.0375 data: 0.0002 max mem: 190
Epoch: [17] [180/468] eta: 0:00:11 lr: 0.0 img/s: 3560.3172030531923 loss: 0.5022 (0.5056) acc1: 100.0000 (99.8532) acc5: 100.0000 (99.9957) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [17] [190/468] eta: 0:00:11 lr: 0.0 img/s: 3294.737658639566 loss: 0.5019 (0.5057) acc1: 100.0000 (99.8527) acc5: 100.0000 (99.9959) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [17] [200/468] eta: 0:00:10 lr: 0.0 img/s: 3465.4048269140153 loss: 0.5015 (0.5056) acc1: 100.0000 (99.8523) acc5: 100.0000 (99.9961) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [17] [210/468] eta: 0:00:10 lr: 0.0 img/s: 3518.665285952103 loss: 0.5014 (0.5055) acc1: 100.0000 (99.8556) acc5: 100.0000 (99.9963) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [17] [220/468] eta: 0:00:10 lr: 0.0 img/s: 1131.3978588889825 loss: 0.5014 (0.5053) acc1: 100.0000 (99.8621) acc5: 100.0000 (99.9965) time: 0.0452 data: 0.0003 max mem: 190
Epoch: [17] [230/468] eta: 0:00:09 lr: 0.0 img/s: 2135.9070318871713 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8647) acc5: 100.0000 (99.9966) time: 0.0558 data: 0.0021 max mem: 190
Epoch: [17] [240/468] eta: 0:00:09 lr: 0.0 img/s: 3016.535442981076 loss: 0.5014 (0.5053) acc1: 100.0000 (99.8638) acc5: 100.0000 (99.9968) time: 0.0512 data: 0.0021 max mem: 190
Epoch: [17] [250/468] eta: 0:00:09 lr: 0.0 img/s: 2825.2497658215193 loss: 0.5018 (0.5052) acc1: 100.0000 (99.8693) acc5: 100.0000 (99.9969) time: 0.0456 data: 0.0004 max mem: 190
Epoch: [17] [260/468] eta: 0:00:08 lr: 0.0 img/s: 2882.9463170499885 loss: 0.5016 (0.5051) acc1: 100.0000 (99.8713) acc5: 100.0000 (99.9970) time: 0.0464 data: 0.0004 max mem: 190
Epoch: [17] [270/468] eta: 0:00:08 lr: 0.0 img/s: 3288.562068923273 loss: 0.5020 (0.5053) acc1: 100.0000 (99.8616) acc5: 100.0000 (99.9971) time: 0.0437 data: 0.0003 max mem: 190
Epoch: [17] [280/468] eta: 0:00:07 lr: 0.0 img/s: 3306.242183506691 loss: 0.5015 (0.5052) acc1: 100.0000 (99.8638) acc5: 100.0000 (99.9972) time: 0.0411 data: 0.0003 max mem: 190
Epoch: [17] [290/468] eta: 0:00:07 lr: 0.0 img/s: 3393.085239374309 loss: 0.5015 (0.5052) acc1: 100.0000 (99.8631) acc5: 100.0000 (99.9973) time: 0.0401 data: 0.0003 max mem: 190
Epoch: [17] [300/468] eta: 0:00:07 lr: 0.0 img/s: 3480.4342966795025 loss: 0.5016 (0.5053) acc1: 100.0000 (99.8598) acc5: 100.0000 (99.9974) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [17] [310/468] eta: 0:00:06 lr: 0.0 img/s: 3473.678533069348 loss: 0.5016 (0.5053) acc1: 100.0000 (99.8593) acc5: 100.0000 (99.9975) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [17] [320/468] eta: 0:00:06 lr: 0.0 img/s: 3557.3211767824014 loss: 0.5016 (0.5053) acc1: 100.0000 (99.8613) acc5: 100.0000 (99.9976) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [17] [330/468] eta: 0:00:05 lr: 0.0 img/s: 3493.024710796497 loss: 0.5015 (0.5054) acc1: 100.0000 (99.8607) acc5: 100.0000 (99.9976) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [17] [340/468] eta: 0:00:05 lr: 0.0 img/s: 3491.5481097533216 loss: 0.5016 (0.5052) acc1: 100.0000 (99.8648) acc5: 100.0000 (99.9977) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [17] [350/468] eta: 0:00:04 lr: 0.0 img/s: 3523.5148587629947 loss: 0.5014 (0.5052) acc1: 100.0000 (99.8687) acc5: 100.0000 (99.9978) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [17] [360/468] eta: 0:00:04 lr: 0.0 img/s: 3506.46214135028 loss: 0.5014 (0.5051) acc1: 100.0000 (99.8702) acc5: 100.0000 (99.9978) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [17] [370/468] eta: 0:00:04 lr: 0.0 img/s: 3623.613226331171 loss: 0.5015 (0.5052) acc1: 100.0000 (99.8652) acc5: 100.0000 (99.9979) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [17] [380/468] eta: 0:00:03 lr: 0.0 img/s: 3479.9379812803027 loss: 0.5022 (0.5053) acc1: 100.0000 (99.8606) acc5: 100.0000 (99.9979) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [17] [390/468] eta: 0:00:03 lr: 0.0 img/s: 3509.7630961331024 loss: 0.5022 (0.5053) acc1: 100.0000 (99.8621) acc5: 100.0000 (99.9980) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [17] [400/468] eta: 0:00:02 lr: 0.0 img/s: 3538.9837444463487 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8636) acc5: 100.0000 (99.9981) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [17] [410/468] eta: 0:00:02 lr: 0.0 img/s: 3372.5589366032614 loss: 0.5011 (0.5052) acc1: 100.0000 (99.8650) acc5: 100.0000 (99.9981) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [17] [420/468] eta: 0:00:01 lr: 0.0 img/s: 3518.20411800941 loss: 0.5013 (0.5052) acc1: 100.0000 (99.8682) acc5: 100.0000 (99.9981) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [17] [430/468] eta: 0:00:01 lr: 0.0 img/s: 3518.18106278547 loss: 0.5017 (0.5053) acc1: 100.0000 (99.8659) acc5: 100.0000 (99.9982) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [17] [440/468] eta: 0:00:01 lr: 0.0 img/s: 3516.79830209814 loss: 0.5017 (0.5053) acc1: 100.0000 (99.8654) acc5: 100.0000 (99.9982) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [17] [450/468] eta: 0:00:00 lr: 0.0 img/s: 3461.16000593116 loss: 0.5017 (0.5054) acc1: 100.0000 (99.8632) acc5: 100.0000 (99.9983) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [17] [460/468] eta: 0:00:00 lr: 0.0 img/s: 3561.0492829758164 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8661) acc5: 100.0000 (99.9983) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [17] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:10 loss: 0.5526 (0.5526) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1292 data: 0.1093 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.760 Acc@5 99.900
Epoch: [18] [ 0/468] eta: 0:01:43 lr: 0.0 img/s: 1582.0426281777382 loss: 0.5010 (0.5010) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.2210 data: 0.1400 max mem: 190
Epoch: [18] [ 10/468] eta: 0:00:26 lr: 0.0 img/s: 3425.472707667375 loss: 0.5014 (0.5025) acc1: 100.0000 (99.9290) acc5: 100.0000 (100.0000) time: 0.0577 data: 0.0130 max mem: 190
Epoch: [18] [ 20/468] eta: 0:00:21 lr: 0.0 img/s: 3583.296036735947 loss: 0.5018 (0.5044) acc1: 100.0000 (99.8512) acc5: 100.0000 (100.0000) time: 0.0393 data: 0.0003 max mem: 190
Epoch: [18] [ 30/468] eta: 0:00:19 lr: 0.0 img/s: 3592.9122436004686 loss: 0.5023 (0.5055) acc1: 100.0000 (99.8236) acc5: 100.0000 (100.0000) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [18] [ 40/468] eta: 0:00:18 lr: 0.0 img/s: 3484.839652340986 loss: 0.5023 (0.5059) acc1: 100.0000 (99.8285) acc5: 100.0000 (100.0000) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [18] [ 50/468] eta: 0:00:17 lr: 0.0 img/s: 3621.437806918137 loss: 0.5021 (0.5062) acc1: 100.0000 (99.8162) acc5: 100.0000 (100.0000) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [18] [ 60/468] eta: 0:00:16 lr: 0.0 img/s: 3508.08891909199 loss: 0.5019 (0.5055) acc1: 100.0000 (99.8463) acc5: 100.0000 (100.0000) time: 0.0364 data: 0.0003 max mem: 190
Epoch: [18] [ 70/468] eta: 0:00:15 lr: 0.0 img/s: 3505.271655316954 loss: 0.5017 (0.5054) acc1: 100.0000 (99.8460) acc5: 100.0000 (100.0000) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [18] [ 80/468] eta: 0:00:15 lr: 0.0 img/s: 3068.0616960117036 loss: 0.5018 (0.5057) acc1: 100.0000 (99.8457) acc5: 100.0000 (100.0000) time: 0.0387 data: 0.0002 max mem: 190
Epoch: [18] [ 90/468] eta: 0:00:15 lr: 0.0 img/s: 3119.982054336772 loss: 0.5018 (0.5058) acc1: 100.0000 (99.8455) acc5: 100.0000 (100.0000) time: 0.0412 data: 0.0003 max mem: 190
Epoch: [18] [100/468] eta: 0:00:14 lr: 0.0 img/s: 3181.7023652192465 loss: 0.5017 (0.5054) acc1: 100.0000 (99.8608) acc5: 100.0000 (100.0000) time: 0.0408 data: 0.0003 max mem: 190
Epoch: [18] [110/468] eta: 0:00:14 lr: 0.0 img/s: 3458.952349045177 loss: 0.5017 (0.5057) acc1: 100.0000 (99.8522) acc5: 100.0000 (100.0000) time: 0.0400 data: 0.0003 max mem: 190
Epoch: [18] [120/468] eta: 0:00:13 lr: 0.0 img/s: 3476.737893250787 loss: 0.5021 (0.5061) acc1: 100.0000 (99.8386) acc5: 100.0000 (100.0000) time: 0.0385 data: 0.0002 max mem: 190
Epoch: [18] [130/468] eta: 0:00:13 lr: 0.0 img/s: 3438.6367170737017 loss: 0.5021 (0.5058) acc1: 100.0000 (99.8449) acc5: 100.0000 (100.0000) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [18] [140/468] eta: 0:00:12 lr: 0.0 img/s: 3564.4065329969458 loss: 0.5016 (0.5057) acc1: 100.0000 (99.8504) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [18] [150/468] eta: 0:00:12 lr: 0.0 img/s: 2829.6127294671987 loss: 0.5014 (0.5059) acc1: 100.0000 (99.8500) acc5: 100.0000 (100.0000) time: 0.0400 data: 0.0003 max mem: 190
Epoch: [18] [160/468] eta: 0:00:12 lr: 0.0 img/s: 3410.263180628605 loss: 0.5017 (0.5058) acc1: 100.0000 (99.8496) acc5: 100.0000 (100.0000) time: 0.0419 data: 0.0003 max mem: 190
Epoch: [18] [170/468] eta: 0:00:11 lr: 0.0 img/s: 3449.817263515033 loss: 0.5020 (0.5058) acc1: 100.0000 (99.8492) acc5: 100.0000 (100.0000) time: 0.0401 data: 0.0003 max mem: 190
Epoch: [18] [180/468] eta: 0:00:11 lr: 0.0 img/s: 3422.459229793392 loss: 0.5015 (0.5058) acc1: 100.0000 (99.8489) acc5: 100.0000 (100.0000) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [18] [190/468] eta: 0:00:10 lr: 0.0 img/s: 3534.1147908973016 loss: 0.5017 (0.5058) acc1: 100.0000 (99.8527) acc5: 100.0000 (99.9959) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [18] [200/468] eta: 0:00:10 lr: 0.0 img/s: 3625.668829984805 loss: 0.5018 (0.5057) acc1: 100.0000 (99.8562) acc5: 100.0000 (99.9961) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [18] [210/468] eta: 0:00:10 lr: 0.0 img/s: 3558.1463498691055 loss: 0.5016 (0.5059) acc1: 100.0000 (99.8519) acc5: 100.0000 (99.9963) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [18] [220/468] eta: 0:00:09 lr: 0.0 img/s: 3487.5335325451474 loss: 0.5018 (0.5059) acc1: 100.0000 (99.8480) acc5: 100.0000 (99.9965) time: 0.0404 data: 0.0006 max mem: 190
Epoch: [18] [230/468] eta: 0:00:09 lr: 0.0 img/s: 3458.0611650735577 loss: 0.5018 (0.5057) acc1: 100.0000 (99.8546) acc5: 100.0000 (99.9966) time: 0.0432 data: 0.0013 max mem: 190
Epoch: [18] [240/468] eta: 0:00:09 lr: 0.0 img/s: 3391.713334470494 loss: 0.5017 (0.5057) acc1: 100.0000 (99.8574) acc5: 100.0000 (99.9968) time: 0.0404 data: 0.0010 max mem: 190
Epoch: [18] [250/468] eta: 0:00:08 lr: 0.0 img/s: 3510.267956035909 loss: 0.5015 (0.5056) acc1: 100.0000 (99.8599) acc5: 100.0000 (99.9969) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [18] [260/468] eta: 0:00:08 lr: 0.0 img/s: 3547.191045979214 loss: 0.5015 (0.5057) acc1: 100.0000 (99.8533) acc5: 100.0000 (99.9970) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [18] [270/468] eta: 0:00:07 lr: 0.0 img/s: 2666.648017166018 loss: 0.5015 (0.5056) acc1: 100.0000 (99.8587) acc5: 100.0000 (99.9971) time: 0.0450 data: 0.0015 max mem: 190
Epoch: [18] [280/468] eta: 0:00:07 lr: 0.0 img/s: 3161.4301815462345 loss: 0.5014 (0.5056) acc1: 100.0000 (99.8610) acc5: 100.0000 (99.9972) time: 0.0500 data: 0.0033 max mem: 190
Epoch: [18] [290/468] eta: 0:00:07 lr: 0.0 img/s: 2749.0535349449287 loss: 0.5010 (0.5054) acc1: 100.0000 (99.8658) acc5: 100.0000 (99.9973) time: 0.0479 data: 0.0037 max mem: 190
Epoch: [18] [300/468] eta: 0:00:06 lr: 0.0 img/s: 1436.9591023939017 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8676) acc5: 100.0000 (99.9974) time: 0.0534 data: 0.0050 max mem: 190
Epoch: [18] [310/468] eta: 0:00:06 lr: 0.0 img/s: 1795.242673514974 loss: 0.5012 (0.5053) acc1: 100.0000 (99.8694) acc5: 100.0000 (99.9975) time: 0.0713 data: 0.0073 max mem: 190
Epoch: [18] [320/468] eta: 0:00:06 lr: 0.0 img/s: 3321.3372184381633 loss: 0.5012 (0.5052) acc1: 100.0000 (99.8710) acc5: 100.0000 (99.9976) time: 0.0763 data: 0.0082 max mem: 190
Epoch: [18] [330/468] eta: 0:00:05 lr: 0.0 img/s: 3390.8564571240895 loss: 0.5013 (0.5052) acc1: 100.0000 (99.8678) acc5: 100.0000 (99.9976) time: 0.0527 data: 0.0042 max mem: 190
Epoch: [18] [340/468] eta: 0:00:05 lr: 0.0 img/s: 3459.866289448415 loss: 0.5017 (0.5053) acc1: 100.0000 (99.8671) acc5: 100.0000 (99.9977) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [18] [350/468] eta: 0:00:05 lr: 0.0 img/s: 3461.9411650911156 loss: 0.5018 (0.5052) acc1: 100.0000 (99.8709) acc5: 100.0000 (99.9978) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [18] [360/468] eta: 0:00:04 lr: 0.0 img/s: 3527.357800817335 loss: 0.5018 (0.5054) acc1: 100.0000 (99.8637) acc5: 100.0000 (99.9978) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [18] [370/468] eta: 0:00:04 lr: 0.0 img/s: 3433.3809475084417 loss: 0.5015 (0.5055) acc1: 100.0000 (99.8631) acc5: 100.0000 (99.9979) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [18] [380/468] eta: 0:00:03 lr: 0.0 img/s: 3431.603346777544 loss: 0.5013 (0.5054) acc1: 100.0000 (99.8667) acc5: 100.0000 (99.9979) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [18] [390/468] eta: 0:00:03 lr: 0.0 img/s: 3442.7602057175104 loss: 0.5014 (0.5054) acc1: 100.0000 (99.8661) acc5: 100.0000 (99.9980) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [18] [400/468] eta: 0:00:02 lr: 0.0 img/s: 3547.4723105081966 loss: 0.5018 (0.5055) acc1: 100.0000 (99.8636) acc5: 100.0000 (99.9981) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [18] [410/468] eta: 0:00:02 lr: 0.0 img/s: 3446.229816734602 loss: 0.5017 (0.5055) acc1: 100.0000 (99.8650) acc5: 100.0000 (99.9981) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [18] [420/468] eta: 0:00:02 lr: 0.0 img/s: 3311.5649642240314 loss: 0.5014 (0.5055) acc1: 100.0000 (99.8627) acc5: 100.0000 (99.9981) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [18] [430/468] eta: 0:00:01 lr: 0.0 img/s: 3400.37059650634 loss: 0.5021 (0.5055) acc1: 100.0000 (99.8622) acc5: 100.0000 (99.9982) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [18] [440/468] eta: 0:00:01 lr: 0.0 img/s: 3415.687386275433 loss: 0.5019 (0.5054) acc1: 100.0000 (99.8654) acc5: 100.0000 (99.9982) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [18] [450/468] eta: 0:00:00 lr: 0.0 img/s: 3421.913877061928 loss: 0.5017 (0.5053) acc1: 100.0000 (99.8683) acc5: 100.0000 (99.9983) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [18] [460/468] eta: 0:00:00 lr: 0.0 img/s: 3543.0241867893274 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8678) acc5: 100.0000 (99.9983) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [18] Total time: 0:00:19
Test: [ 0/79] eta: 0:00:09 loss: 0.5526 (0.5526) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1238 data: 0.1070 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.760 Acc@5 99.900
Epoch: [19] [ 0/468] eta: 0:01:14 lr: 0.0 img/s: 2553.5975951408145 loss: 0.5011 (0.5011) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.1595 data: 0.1094 max mem: 190
Epoch: [19] [ 10/468] eta: 0:00:22 lr: 0.0 img/s: 3504.425070823379 loss: 0.5017 (0.5064) acc1: 100.0000 (99.8580) acc5: 100.0000 (100.0000) time: 0.0489 data: 0.0105 max mem: 190
Epoch: [19] [ 20/468] eta: 0:00:24 lr: 0.0 img/s: 3486.65020554751 loss: 0.5017 (0.5057) acc1: 100.0000 (99.8884) acc5: 100.0000 (100.0000) time: 0.0484 data: 0.0032 max mem: 190
Epoch: [19] [ 30/468] eta: 0:00:21 lr: 0.0 img/s: 3525.944661539573 loss: 0.5015 (0.5067) acc1: 100.0000 (99.8488) acc5: 100.0000 (100.0000) time: 0.0498 data: 0.0034 max mem: 190
Epoch: [19] [ 40/468] eta: 0:00:19 lr: 0.0 img/s: 3489.2788519657097 loss: 0.5016 (0.5076) acc1: 100.0000 (99.8285) acc5: 100.0000 (100.0000) time: 0.0394 data: 0.0007 max mem: 190
Epoch: [19] [ 50/468] eta: 0:00:18 lr: 0.0 img/s: 3354.8351361315763 loss: 0.5016 (0.5069) acc1: 100.0000 (99.8468) acc5: 100.0000 (100.0000) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [19] [ 60/468] eta: 0:00:17 lr: 0.0 img/s: 3312.6683696765517 loss: 0.5014 (0.5068) acc1: 100.0000 (99.8463) acc5: 100.0000 (100.0000) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [19] [ 70/468] eta: 0:00:17 lr: 0.0 img/s: 2751.462736134316 loss: 0.5014 (0.5070) acc1: 100.0000 (99.8349) acc5: 100.0000 (100.0000) time: 0.0392 data: 0.0003 max mem: 190
Epoch: [19] [ 80/468] eta: 0:00:16 lr: 0.0 img/s: 3579.8076440935642 loss: 0.5017 (0.5069) acc1: 100.0000 (99.8360) acc5: 100.0000 (100.0000) time: 0.0396 data: 0.0003 max mem: 190
Epoch: [19] [ 90/468] eta: 0:00:15 lr: 0.0 img/s: 3475.297523336052 loss: 0.5018 (0.5067) acc1: 100.0000 (99.8455) acc5: 100.0000 (99.9914) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [19] [100/468] eta: 0:00:15 lr: 0.0 img/s: 3478.4273468832407 loss: 0.5018 (0.5063) acc1: 100.0000 (99.8530) acc5: 100.0000 (99.9923) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [19] [110/468] eta: 0:00:14 lr: 0.0 img/s: 3450.704202901345 loss: 0.5015 (0.5061) acc1: 100.0000 (99.8522) acc5: 100.0000 (99.9930) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [19] [120/468] eta: 0:00:14 lr: 0.0 img/s: 3475.7925158617118 loss: 0.5014 (0.5058) acc1: 100.0000 (99.8644) acc5: 100.0000 (99.9935) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [19] [130/468] eta: 0:00:13 lr: 0.0 img/s: 3526.477351550184 loss: 0.5016 (0.5058) acc1: 100.0000 (99.8569) acc5: 100.0000 (99.9940) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [19] [140/468] eta: 0:00:13 lr: 0.0 img/s: 3456.5917150620016 loss: 0.5015 (0.5055) acc1: 100.0000 (99.8615) acc5: 100.0000 (99.9945) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [19] [150/468] eta: 0:00:12 lr: 0.0 img/s: 3405.741748447382 loss: 0.5017 (0.5060) acc1: 100.0000 (99.8448) acc5: 100.0000 (99.9948) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [19] [160/468] eta: 0:00:12 lr: 0.0 img/s: 3511.1403289624277 loss: 0.5019 (0.5060) acc1: 100.0000 (99.8447) acc5: 100.0000 (99.9951) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [19] [170/468] eta: 0:00:11 lr: 0.0 img/s: 3486.763427591671 loss: 0.5012 (0.5060) acc1: 100.0000 (99.8447) acc5: 100.0000 (99.9954) time: 0.0377 data: 0.0002 max mem: 190
Epoch: [19] [180/468] eta: 0:00:11 lr: 0.0 img/s: 3573.5646524754716 loss: 0.5015 (0.5060) acc1: 100.0000 (99.8446) acc5: 100.0000 (99.9957) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [19] [190/468] eta: 0:00:11 lr: 0.0 img/s: 2244.987965309314 loss: 0.5017 (0.5059) acc1: 100.0000 (99.8487) acc5: 100.0000 (99.9959) time: 0.0387 data: 0.0003 max mem: 190
Epoch: [19] [200/468] eta: 0:00:10 lr: 0.0 img/s: 2122.91724958777 loss: 0.5017 (0.5059) acc1: 100.0000 (99.8484) acc5: 100.0000 (99.9961) time: 0.0410 data: 0.0005 max mem: 190
Epoch: [19] [210/468] eta: 0:00:10 lr: 0.0 img/s: 3555.3894120607674 loss: 0.5016 (0.5058) acc1: 100.0000 (99.8556) acc5: 100.0000 (99.9963) time: 0.0397 data: 0.0008 max mem: 190
Epoch: [19] [220/468] eta: 0:00:09 lr: 0.0 img/s: 3395.510220603117 loss: 0.5021 (0.5059) acc1: 100.0000 (99.8480) acc5: 100.0000 (99.9965) time: 0.0376 data: 0.0006 max mem: 190
Epoch: [19] [230/468] eta: 0:00:09 lr: 0.0 img/s: 3594.1870765605336 loss: 0.5019 (0.5059) acc1: 100.0000 (99.8512) acc5: 100.0000 (99.9966) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [19] [240/468] eta: 0:00:08 lr: 0.0 img/s: 3571.2113241936236 loss: 0.5017 (0.5058) acc1: 100.0000 (99.8541) acc5: 100.0000 (99.9968) time: 0.0363 data: 0.0003 max mem: 190
Epoch: [19] [250/468] eta: 0:00:08 lr: 0.0 img/s: 3560.0102913677174 loss: 0.5014 (0.5058) acc1: 100.0000 (99.8537) acc5: 100.0000 (99.9969) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [19] [260/468] eta: 0:00:08 lr: 0.0 img/s: 3507.401364099616 loss: 0.5014 (0.5058) acc1: 100.0000 (99.8533) acc5: 100.0000 (99.9970) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [19] [270/468] eta: 0:00:07 lr: 0.0 img/s: 3432.788209341731 loss: 0.5022 (0.5058) acc1: 100.0000 (99.8530) acc5: 100.0000 (99.9971) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [19] [280/468] eta: 0:00:07 lr: 0.0 img/s: 3539.5203818590576 loss: 0.5018 (0.5058) acc1: 100.0000 (99.8526) acc5: 100.0000 (99.9972) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [19] [290/468] eta: 0:00:06 lr: 0.0 img/s: 3547.2613579300682 loss: 0.5017 (0.5056) acc1: 100.0000 (99.8577) acc5: 100.0000 (99.9973) time: 0.0376 data: 0.0002 max mem: 190
Epoch: [19] [300/468] eta: 0:00:06 lr: 0.0 img/s: 2699.906018667525 loss: 0.5016 (0.5055) acc1: 100.0000 (99.8598) acc5: 100.0000 (99.9974) time: 0.0444 data: 0.0017 max mem: 190
Epoch: [19] [310/468] eta: 0:00:06 lr: 0.0 img/s: 3534.952506995885 loss: 0.5015 (0.5054) acc1: 100.0000 (99.8643) acc5: 100.0000 (99.9975) time: 0.0490 data: 0.0022 max mem: 190
Epoch: [19] [320/468] eta: 0:00:05 lr: 0.0 img/s: 3387.0485972228357 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8661) acc5: 100.0000 (99.9976) time: 0.0421 data: 0.0008 max mem: 190
Epoch: [19] [330/468] eta: 0:00:05 lr: 0.0 img/s: 3504.2649521882445 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8678) acc5: 100.0000 (99.9976) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [19] [340/468] eta: 0:00:05 lr: 0.0 img/s: 3549.5832170791214 loss: 0.5013 (0.5052) acc1: 100.0000 (99.8694) acc5: 100.0000 (99.9977) time: 0.0370 data: 0.0002 max mem: 190
Epoch: [19] [350/468] eta: 0:00:04 lr: 0.0 img/s: 3632.6606130320047 loss: 0.5011 (0.5051) acc1: 100.0000 (99.8731) acc5: 100.0000 (99.9978) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [19] [360/468] eta: 0:00:04 lr: 0.0 img/s: 3434.5230942449907 loss: 0.5011 (0.5052) acc1: 100.0000 (99.8723) acc5: 100.0000 (99.9978) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [19] [370/468] eta: 0:00:03 lr: 0.0 img/s: 3486.7860728829082 loss: 0.5017 (0.5051) acc1: 100.0000 (99.8737) acc5: 100.0000 (99.9979) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [19] [380/468] eta: 0:00:03 lr: 0.0 img/s: 3491.0259191343816 loss: 0.5014 (0.5052) acc1: 100.0000 (99.8729) acc5: 100.0000 (99.9979) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [19] [390/468] eta: 0:00:03 lr: 0.0 img/s: 3496.4597940695685 loss: 0.5012 (0.5052) acc1: 100.0000 (99.8721) acc5: 100.0000 (99.9980) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [19] [400/468] eta: 0:00:02 lr: 0.0 img/s: 3354.898029070276 loss: 0.5026 (0.5053) acc1: 100.0000 (99.8695) acc5: 100.0000 (99.9981) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [19] [410/468] eta: 0:00:02 lr: 0.0 img/s: 3429.082752101377 loss: 0.5030 (0.5054) acc1: 100.0000 (99.8631) acc5: 100.0000 (99.9981) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [19] [420/468] eta: 0:00:01 lr: 0.0 img/s: 3449.2853187019346 loss: 0.5028 (0.5055) acc1: 100.0000 (99.8608) acc5: 100.0000 (99.9981) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [19] [430/468] eta: 0:00:01 lr: 0.0 img/s: 3384.3156428278753 loss: 0.5021 (0.5055) acc1: 100.0000 (99.8604) acc5: 100.0000 (99.9982) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [19] [440/468] eta: 0:00:01 lr: 0.0 img/s: 3651.8849617718283 loss: 0.5021 (0.5054) acc1: 100.0000 (99.8636) acc5: 100.0000 (99.9982) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [19] [450/468] eta: 0:00:00 lr: 0.0 img/s: 3311.217747953891 loss: 0.5016 (0.5054) acc1: 100.0000 (99.8649) acc5: 100.0000 (99.9983) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [19] [460/468] eta: 0:00:00 lr: 0.0 img/s: 3503.921890092677 loss: 0.5015 (0.5053) acc1: 100.0000 (99.8661) acc5: 100.0000 (99.9983) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [19] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5526 (0.5526) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1225 data: 0.1034 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.760 Acc@5 99.900
Epoch: [20] [ 0/468] eta: 0:01:14 lr: 0.0 img/s: 2341.7148440226115 loss: 0.5010 (0.5010) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.1601 data: 0.1053 max mem: 190
Epoch: [20] [ 10/468] eta: 0:00:21 lr: 0.0 img/s: 3526.8248448021022 loss: 0.5020 (0.5086) acc1: 100.0000 (99.7869) acc5: 100.0000 (99.9290) time: 0.0480 data: 0.0099 max mem: 190
Epoch: [20] [ 20/468] eta: 0:00:19 lr: 0.0 img/s: 3593.922414197064 loss: 0.5016 (0.5055) acc1: 100.0000 (99.8512) acc5: 100.0000 (99.9628) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [20] [ 30/468] eta: 0:00:17 lr: 0.0 img/s: 3534.9059567939844 loss: 0.5012 (0.5058) acc1: 100.0000 (99.8488) acc5: 100.0000 (99.9748) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [20] [ 40/468] eta: 0:00:17 lr: 0.0 img/s: 3485.608907644863 loss: 0.5011 (0.5047) acc1: 100.0000 (99.8857) acc5: 100.0000 (99.9809) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [20] [ 50/468] eta: 0:00:16 lr: 0.0 img/s: 3497.3936653941864 loss: 0.5017 (0.5054) acc1: 100.0000 (99.8621) acc5: 100.0000 (99.9847) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [20] [ 60/468] eta: 0:00:15 lr: 0.0 img/s: 3581.527098065377 loss: 0.5021 (0.5058) acc1: 100.0000 (99.8591) acc5: 100.0000 (99.9872) time: 0.0377 data: 0.0002 max mem: 190
Epoch: [20] [ 70/468] eta: 0:00:15 lr: 0.0 img/s: 3528.6329142211152 loss: 0.5021 (0.5062) acc1: 100.0000 (99.8349) acc5: 100.0000 (99.9890) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [20] [ 80/468] eta: 0:00:15 lr: 0.0 img/s: 2350.4599690908058 loss: 0.5021 (0.5064) acc1: 100.0000 (99.8167) acc5: 100.0000 (99.9904) time: 0.0438 data: 0.0006 max mem: 190
Epoch: [20] [ 90/468] eta: 0:00:16 lr: 0.0 img/s: 2334.2924002034842 loss: 0.5032 (0.5071) acc1: 100.0000 (99.7940) acc5: 100.0000 (99.9914) time: 0.0559 data: 0.0035 max mem: 190
Epoch: [20] [100/468] eta: 0:00:16 lr: 0.0 img/s: 2650.677699823739 loss: 0.5027 (0.5071) acc1: 100.0000 (99.7989) acc5: 100.0000 (99.9923) time: 0.0599 data: 0.0053 max mem: 190
Epoch: [20] [110/468] eta: 0:00:16 lr: 0.0 img/s: 3581.6704604587244 loss: 0.5015 (0.5067) acc1: 100.0000 (99.8170) acc5: 100.0000 (99.9930) time: 0.0547 data: 0.0049 max mem: 190
Epoch: [20] [120/468] eta: 0:00:15 lr: 0.0 img/s: 3442.914752941931 loss: 0.5014 (0.5064) acc1: 100.0000 (99.8192) acc5: 100.0000 (99.9935) time: 0.0442 data: 0.0027 max mem: 190
Epoch: [20] [130/468] eta: 0:00:14 lr: 0.0 img/s: 3491.4345767649966 loss: 0.5014 (0.5060) acc1: 100.0000 (99.8330) acc5: 100.0000 (99.9940) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [20] [140/468] eta: 0:00:14 lr: 0.0 img/s: 3387.8607929626614 loss: 0.5012 (0.5059) acc1: 100.0000 (99.8338) acc5: 100.0000 (99.9945) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [20] [150/468] eta: 0:00:13 lr: 0.0 img/s: 3553.7184804697067 loss: 0.5013 (0.5059) acc1: 100.0000 (99.8344) acc5: 100.0000 (99.9948) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [20] [160/468] eta: 0:00:13 lr: 0.0 img/s: 3362.335988776993 loss: 0.5016 (0.5056) acc1: 100.0000 (99.8447) acc5: 100.0000 (99.9951) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [20] [170/468] eta: 0:00:12 lr: 0.0 img/s: 3438.6587415453987 loss: 0.5016 (0.5055) acc1: 100.0000 (99.8492) acc5: 100.0000 (99.9954) time: 0.0374 data: 0.0002 max mem: 190
Epoch: [20] [180/468] eta: 0:00:12 lr: 0.0 img/s: 3528.2155029080272 loss: 0.5015 (0.5054) acc1: 100.0000 (99.8532) acc5: 100.0000 (99.9957) time: 0.0376 data: 0.0002 max mem: 190
Epoch: [20] [190/468] eta: 0:00:11 lr: 0.0 img/s: 1953.906248180283 loss: 0.5014 (0.5054) acc1: 100.0000 (99.8527) acc5: 100.0000 (99.9959) time: 0.0404 data: 0.0003 max mem: 190
Epoch: [20] [200/468] eta: 0:00:11 lr: 0.0 img/s: 3480.118443228667 loss: 0.5014 (0.5053) acc1: 100.0000 (99.8562) acc5: 100.0000 (99.9961) time: 0.0413 data: 0.0003 max mem: 190
Epoch: [20] [210/468] eta: 0:00:10 lr: 0.0 img/s: 3652.009169631378 loss: 0.5014 (0.5051) acc1: 100.0000 (99.8593) acc5: 100.0000 (99.9963) time: 0.0400 data: 0.0003 max mem: 190
Epoch: [20] [220/468] eta: 0:00:10 lr: 0.0 img/s: 3518.8959152640136 loss: 0.5023 (0.5051) acc1: 100.0000 (99.8621) acc5: 100.0000 (99.9965) time: 0.0387 data: 0.0003 max mem: 190
Epoch: [20] [230/468] eta: 0:00:09 lr: 0.0 img/s: 3478.3597372137924 loss: 0.5024 (0.5052) acc1: 100.0000 (99.8647) acc5: 100.0000 (99.9966) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [20] [240/468] eta: 0:00:09 lr: 0.0 img/s: 3593.104613263551 loss: 0.5015 (0.5052) acc1: 100.0000 (99.8638) acc5: 100.0000 (99.9968) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [20] [250/468] eta: 0:00:08 lr: 0.0 img/s: 3502.0705148694396 loss: 0.5011 (0.5050) acc1: 100.0000 (99.8693) acc5: 100.0000 (99.9969) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [20] [260/468] eta: 0:00:08 lr: 0.0 img/s: 3548.550904536231 loss: 0.5014 (0.5049) acc1: 100.0000 (99.8713) acc5: 100.0000 (99.9970) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [20] [270/468] eta: 0:00:08 lr: 0.0 img/s: 2910.8002667519695 loss: 0.5018 (0.5049) acc1: 100.0000 (99.8732) acc5: 100.0000 (99.9971) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [20] [280/468] eta: 0:00:07 lr: 0.0 img/s: 3321.871535791408 loss: 0.5018 (0.5048) acc1: 100.0000 (99.8777) acc5: 100.0000 (99.9972) time: 0.0391 data: 0.0003 max mem: 190
Epoch: [20] [290/468] eta: 0:00:07 lr: 0.0 img/s: 3177.1833563147648 loss: 0.5011 (0.5048) acc1: 100.0000 (99.8792) acc5: 100.0000 (99.9973) time: 0.0402 data: 0.0003 max mem: 190
Epoch: [20] [300/468] eta: 0:00:06 lr: 0.0 img/s: 3421.7175926220993 loss: 0.5013 (0.5047) acc1: 100.0000 (99.8832) acc5: 100.0000 (99.9974) time: 0.0398 data: 0.0003 max mem: 190
Epoch: [20] [310/468] eta: 0:00:06 lr: 0.0 img/s: 3331.3326797304508 loss: 0.5014 (0.5048) acc1: 100.0000 (99.8819) acc5: 100.0000 (99.9975) time: 0.0390 data: 0.0003 max mem: 190
Epoch: [20] [320/468] eta: 0:00:06 lr: 0.0 img/s: 3345.3444414673204 loss: 0.5013 (0.5048) acc1: 100.0000 (99.8807) acc5: 100.0000 (99.9976) time: 0.0390 data: 0.0003 max mem: 190
Epoch: [20] [330/468] eta: 0:00:05 lr: 0.0 img/s: 3488.5759808699495 loss: 0.5024 (0.5050) acc1: 100.0000 (99.8749) acc5: 100.0000 (99.9976) time: 0.0384 data: 0.0002 max mem: 190
Epoch: [20] [340/468] eta: 0:00:05 lr: 0.0 img/s: 3523.722996344161 loss: 0.5023 (0.5049) acc1: 100.0000 (99.8740) acc5: 100.0000 (99.9977) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [20] [350/468] eta: 0:00:04 lr: 0.0 img/s: 2873.1184416140427 loss: 0.5014 (0.5048) acc1: 100.0000 (99.8776) acc5: 100.0000 (99.9978) time: 0.0456 data: 0.0013 max mem: 190
Epoch: [20] [360/468] eta: 0:00:04 lr: 0.0 img/s: 3543.959706645367 loss: 0.5016 (0.5051) acc1: 100.0000 (99.8702) acc5: 100.0000 (99.9978) time: 0.0500 data: 0.0031 max mem: 190
Epoch: [20] [370/468] eta: 0:00:04 lr: 0.0 img/s: 3406.02263614678 loss: 0.5031 (0.5052) acc1: 100.0000 (99.8652) acc5: 100.0000 (99.9979) time: 0.0420 data: 0.0021 max mem: 190
Epoch: [20] [380/468] eta: 0:00:03 lr: 0.0 img/s: 3502.6417182076775 loss: 0.5021 (0.5053) acc1: 100.0000 (99.8647) acc5: 100.0000 (99.9979) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [20] [390/468] eta: 0:00:03 lr: 0.0 img/s: 3555.130432479323 loss: 0.5016 (0.5053) acc1: 100.0000 (99.8641) acc5: 100.0000 (99.9980) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [20] [400/468] eta: 0:00:02 lr: 0.0 img/s: 3579.879255046043 loss: 0.5021 (0.5053) acc1: 100.0000 (99.8656) acc5: 100.0000 (99.9981) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [20] [410/468] eta: 0:00:02 lr: 0.0 img/s: 3541.8321150547567 loss: 0.5021 (0.5055) acc1: 100.0000 (99.8593) acc5: 100.0000 (99.9981) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [20] [420/468] eta: 0:00:01 lr: 0.0 img/s: 3626.5015232266737 loss: 0.5020 (0.5055) acc1: 100.0000 (99.8608) acc5: 100.0000 (99.9981) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [20] [430/468] eta: 0:00:01 lr: 0.0 img/s: 3511.5077736135368 loss: 0.5018 (0.5054) acc1: 100.0000 (99.8622) acc5: 100.0000 (99.9982) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [20] [440/468] eta: 0:00:01 lr: 0.0 img/s: 3442.4290798104607 loss: 0.5018 (0.5054) acc1: 100.0000 (99.8636) acc5: 100.0000 (99.9982) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [20] [450/468] eta: 0:00:00 lr: 0.0 img/s: 3436.281727642797 loss: 0.5014 (0.5054) acc1: 100.0000 (99.8649) acc5: 100.0000 (99.9983) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [20] [460/468] eta: 0:00:00 lr: 0.0 img/s: 3585.8808693677447 loss: 0.5015 (0.5053) acc1: 100.0000 (99.8661) acc5: 100.0000 (99.9983) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [20] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5526 (0.5526) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1171 data: 0.1000 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.760 Acc@5 99.900
Epoch: [21] [ 0/468] eta: 0:01:14 lr: 0.0 img/s: 2728.0719123961485 loss: 0.5023 (0.5023) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.1582 data: 0.1113 max mem: 190
Epoch: [21] [ 10/468] eta: 0:00:21 lr: 0.0 img/s: 3509.7630961331024 loss: 0.5019 (0.5035) acc1: 100.0000 (99.8580) acc5: 100.0000 (100.0000) time: 0.0480 data: 0.0104 max mem: 190
Epoch: [21] [ 20/468] eta: 0:00:19 lr: 0.0 img/s: 3548.363275854092 loss: 0.5011 (0.5025) acc1: 100.0000 (99.9256) acc5: 100.0000 (100.0000) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [21] [ 30/468] eta: 0:00:17 lr: 0.0 img/s: 3536.9320245075432 loss: 0.5011 (0.5033) acc1: 100.0000 (99.8992) acc5: 100.0000 (100.0000) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [21] [ 40/468] eta: 0:00:17 lr: 0.0 img/s: 3387.1127037803462 loss: 0.5015 (0.5049) acc1: 100.0000 (99.8476) acc5: 100.0000 (100.0000) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [21] [ 50/468] eta: 0:00:16 lr: 0.0 img/s: 3461.0038164002062 loss: 0.5020 (0.5049) acc1: 100.0000 (99.8621) acc5: 100.0000 (100.0000) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [21] [ 60/468] eta: 0:00:16 lr: 0.0 img/s: 3559.3258328637253 loss: 0.5020 (0.5063) acc1: 100.0000 (99.8335) acc5: 100.0000 (100.0000) time: 0.0385 data: 0.0003 max mem: 190
Epoch: [21] [ 70/468] eta: 0:00:15 lr: 0.0 img/s: 3587.1507166004076 loss: 0.5018 (0.5066) acc1: 100.0000 (99.8239) acc5: 100.0000 (100.0000) time: 0.0382 data: 0.0002 max mem: 190
Epoch: [21] [ 80/468] eta: 0:00:15 lr: 0.0 img/s: 3504.219206694211 loss: 0.5022 (0.5073) acc1: 100.0000 (99.7975) acc5: 100.0000 (99.9904) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [21] [ 90/468] eta: 0:00:14 lr: 0.0 img/s: 3514.7031882160395 loss: 0.5023 (0.5071) acc1: 100.0000 (99.8111) acc5: 100.0000 (99.9914) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [21] [100/468] eta: 0:00:14 lr: 0.0 img/s: 3495.3898720002085 loss: 0.5015 (0.5065) acc1: 100.0000 (99.8298) acc5: 100.0000 (99.9923) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [21] [110/468] eta: 0:00:13 lr: 0.0 img/s: 3589.837127974698 loss: 0.5015 (0.5065) acc1: 100.0000 (99.8311) acc5: 100.0000 (99.9930) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [21] [120/468] eta: 0:00:13 lr: 0.0 img/s: 3519.795658530509 loss: 0.5017 (0.5061) acc1: 100.0000 (99.8386) acc5: 100.0000 (99.9935) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [21] [130/468] eta: 0:00:13 lr: 0.0 img/s: 3226.63977354 loss: 0.5019 (0.5060) acc1: 100.0000 (99.8449) acc5: 100.0000 (99.9940) time: 0.0443 data: 0.0013 max mem: 190
Epoch: [21] [140/468] eta: 0:00:13 lr: 0.0 img/s: 3418.7324851309872 loss: 0.5012 (0.5058) acc1: 100.0000 (99.8504) acc5: 100.0000 (99.9945) time: 0.0494 data: 0.0030 max mem: 190
Epoch: [21] [150/468] eta: 0:00:12 lr: 0.0 img/s: 3301.728209196632 loss: 0.5011 (0.5056) acc1: 100.0000 (99.8551) acc5: 100.0000 (99.9948) time: 0.0424 data: 0.0021 max mem: 190
Epoch: [21] [160/468] eta: 0:00:12 lr: 0.0 img/s: 3516.1994432982938 loss: 0.5016 (0.5054) acc1: 100.0000 (99.8641) acc5: 100.0000 (99.9951) time: 0.0375 data: 0.0004 max mem: 190
Epoch: [21] [170/468] eta: 0:00:11 lr: 0.0 img/s: 3412.6910930865265 loss: 0.5020 (0.5054) acc1: 100.0000 (99.8675) acc5: 100.0000 (99.9954) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [21] [180/468] eta: 0:00:11 lr: 0.0 img/s: 3458.840925929505 loss: 0.5017 (0.5052) acc1: 100.0000 (99.8748) acc5: 100.0000 (99.9957) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [21] [190/468] eta: 0:00:10 lr: 0.0 img/s: 3420.583945512351 loss: 0.5025 (0.5056) acc1: 100.0000 (99.8609) acc5: 100.0000 (99.9959) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [21] [200/468] eta: 0:00:10 lr: 0.0 img/s: 3542.0657913835194 loss: 0.5025 (0.5056) acc1: 100.0000 (99.8640) acc5: 100.0000 (99.9961) time: 0.0401 data: 0.0007 max mem: 190
Epoch: [21] [210/468] eta: 0:00:10 lr: 0.0 img/s: 2264.1699083992644 loss: 0.5019 (0.5054) acc1: 100.0000 (99.8704) acc5: 100.0000 (99.9963) time: 0.0423 data: 0.0006 max mem: 190
Epoch: [21] [220/468] eta: 0:00:09 lr: 0.0 img/s: 3493.297450646123 loss: 0.5017 (0.5056) acc1: 100.0000 (99.8657) acc5: 100.0000 (99.9965) time: 0.0395 data: 0.0003 max mem: 190
Epoch: [21] [230/468] eta: 0:00:09 lr: 0.0 img/s: 3334.0221079563803 loss: 0.5015 (0.5056) acc1: 100.0000 (99.8647) acc5: 100.0000 (99.9966) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [21] [240/468] eta: 0:00:08 lr: 0.0 img/s: 3511.278111694648 loss: 0.5009 (0.5056) acc1: 100.0000 (99.8671) acc5: 100.0000 (99.9968) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [21] [250/468] eta: 0:00:08 lr: 0.0 img/s: 3417.8183855360326 loss: 0.5010 (0.5054) acc1: 100.0000 (99.8724) acc5: 100.0000 (99.9969) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [21] [260/468] eta: 0:00:08 lr: 0.0 img/s: 3480.8630466495933 loss: 0.5011 (0.5052) acc1: 100.0000 (99.8773) acc5: 100.0000 (99.9970) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [21] [270/468] eta: 0:00:07 lr: 0.0 img/s: 3405.2881046315442 loss: 0.5010 (0.5053) acc1: 100.0000 (99.8760) acc5: 100.0000 (99.9971) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [21] [280/468] eta: 0:00:07 lr: 0.0 img/s: 3434.2814229147875 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8721) acc5: 100.0000 (99.9972) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [21] [290/468] eta: 0:00:06 lr: 0.0 img/s: 3563.2236808920156 loss: 0.5016 (0.5055) acc1: 100.0000 (99.8658) acc5: 100.0000 (99.9973) time: 0.0376 data: 0.0002 max mem: 190
Epoch: [21] [300/468] eta: 0:00:06 lr: 0.0 img/s: 3592.4794871623294 loss: 0.5019 (0.5055) acc1: 100.0000 (99.8650) acc5: 100.0000 (99.9974) time: 0.0386 data: 0.0002 max mem: 190
Epoch: [21] [310/468] eta: 0:00:06 lr: 0.0 img/s: 3356.597030229141 loss: 0.5019 (0.5054) acc1: 100.0000 (99.8694) acc5: 100.0000 (99.9975) time: 0.0393 data: 0.0003 max mem: 190
Epoch: [21] [320/468] eta: 0:00:05 lr: 0.0 img/s: 3536.978628086542 loss: 0.5018 (0.5054) acc1: 100.0000 (99.8686) acc5: 100.0000 (99.9976) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [21] [330/468] eta: 0:00:05 lr: 0.0 img/s: 3451.014739440377 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8702) acc5: 100.0000 (99.9976) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [21] [340/468] eta: 0:00:04 lr: 0.0 img/s: 3516.4527817441085 loss: 0.5016 (0.5055) acc1: 100.0000 (99.8671) acc5: 100.0000 (99.9977) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [21] [350/468] eta: 0:00:04 lr: 0.0 img/s: 3412.734480084417 loss: 0.5018 (0.5055) acc1: 100.0000 (99.8665) acc5: 100.0000 (99.9978) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [21] [360/468] eta: 0:00:04 lr: 0.0 img/s: 3465.7851342104245 loss: 0.5018 (0.5056) acc1: 100.0000 (99.8658) acc5: 100.0000 (99.9978) time: 0.0376 data: 0.0003 max mem: 190
Epoch: [21] [370/468] eta: 0:00:03 lr: 0.0 img/s: 3468.4272165800967 loss: 0.5020 (0.5056) acc1: 100.0000 (99.8631) acc5: 100.0000 (99.9979) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [21] [380/468] eta: 0:00:03 lr: 0.0 img/s: 3363.4102780962403 loss: 0.5013 (0.5055) acc1: 100.0000 (99.8647) acc5: 100.0000 (99.9979) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [21] [390/468] eta: 0:00:03 lr: 0.0 img/s: 2199.209044732099 loss: 0.5015 (0.5056) acc1: 100.0000 (99.8581) acc5: 100.0000 (99.9980) time: 0.0394 data: 0.0003 max mem: 190
Epoch: [21] [400/468] eta: 0:00:02 lr: 0.0 img/s: 2553.67047351773 loss: 0.5015 (0.5055) acc1: 100.0000 (99.8617) acc5: 100.0000 (99.9981) time: 0.0502 data: 0.0012 max mem: 190
Epoch: [21] [410/468] eta: 0:00:02 lr: 0.0 img/s: 1813.054809971835 loss: 0.5012 (0.5056) acc1: 100.0000 (99.8612) acc5: 100.0000 (99.9981) time: 0.0729 data: 0.0018 max mem: 190
Epoch: [21] [420/468] eta: 0:00:01 lr: 0.0 img/s: 3504.1505906925136 loss: 0.5014 (0.5055) acc1: 100.0000 (99.8645) acc5: 100.0000 (99.9981) time: 0.0660 data: 0.0021 max mem: 190
Epoch: [21] [430/468] eta: 0:00:01 lr: 0.0 img/s: 3498.3052402486546 loss: 0.5016 (0.5055) acc1: 100.0000 (99.8641) acc5: 100.0000 (99.9982) time: 0.0414 data: 0.0015 max mem: 190
Epoch: [21] [440/468] eta: 0:00:01 lr: 0.0 img/s: 3455.47932650224 loss: 0.5026 (0.5055) acc1: 100.0000 (99.8654) acc5: 100.0000 (99.9982) time: 0.0375 data: 0.0003 max mem: 190
Epoch: [21] [450/468] eta: 0:00:00 lr: 0.0 img/s: 3469.52553654864 loss: 0.5017 (0.5055) acc1: 100.0000 (99.8649) acc5: 100.0000 (99.9983) time: 0.0379 data: 0.0002 max mem: 190
Epoch: [21] [460/468] eta: 0:00:00 lr: 0.0 img/s: 3408.40123417601 loss: 0.5012 (0.5054) acc1: 100.0000 (99.8644) acc5: 100.0000 (99.9983) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [21] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5526 (0.5526) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1211 data: 0.1003 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.760 Acc@5 99.900
Epoch: [22] [ 0/468] eta: 0:01:15 lr: 0.0 img/s: 2650.978737692452 loss: 0.5289 (0.5289) acc1: 99.2188 (99.2188) acc5: 100.0000 (100.0000) time: 0.1605 data: 0.1121 max mem: 190
Epoch: [22] [ 10/468] eta: 0:00:21 lr: 0.0 img/s: 3643.780071807193 loss: 0.5014 (0.5080) acc1: 100.0000 (99.7869) acc5: 100.0000 (100.0000) time: 0.0478 data: 0.0105 max mem: 190
Epoch: [22] [ 20/468] eta: 0:00:19 lr: 0.0 img/s: 3613.248478974856 loss: 0.5017 (0.5065) acc1: 100.0000 (99.8140) acc5: 100.0000 (100.0000) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [22] [ 30/468] eta: 0:00:18 lr: 0.0 img/s: 3002.331487881533 loss: 0.5025 (0.5070) acc1: 100.0000 (99.7984) acc5: 100.0000 (100.0000) time: 0.0383 data: 0.0002 max mem: 190
Epoch: [22] [ 40/468] eta: 0:00:17 lr: 0.0 img/s: 3119.275085118003 loss: 0.5025 (0.5063) acc1: 100.0000 (99.8285) acc5: 100.0000 (100.0000) time: 0.0408 data: 0.0003 max mem: 190
Epoch: [22] [ 50/468] eta: 0:00:17 lr: 0.0 img/s: 3240.760777969601 loss: 0.5020 (0.5059) acc1: 100.0000 (99.8315) acc5: 100.0000 (100.0000) time: 0.0409 data: 0.0003 max mem: 190
Epoch: [22] [ 60/468] eta: 0:00:16 lr: 0.0 img/s: 3273.243863478399 loss: 0.5018 (0.5053) acc1: 100.0000 (99.8591) acc5: 100.0000 (100.0000) time: 0.0399 data: 0.0003 max mem: 190
Epoch: [22] [ 70/468] eta: 0:00:16 lr: 0.0 img/s: 3546.089855876564 loss: 0.5017 (0.5057) acc1: 100.0000 (99.8460) acc5: 100.0000 (100.0000) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [22] [ 80/468] eta: 0:00:15 lr: 0.0 img/s: 3495.6857423769866 loss: 0.5015 (0.5052) acc1: 100.0000 (99.8553) acc5: 100.0000 (100.0000) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [22] [ 90/468] eta: 0:00:15 lr: 0.0 img/s: 3509.327197615437 loss: 0.5012 (0.5051) acc1: 100.0000 (99.8626) acc5: 100.0000 (100.0000) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [22] [100/468] eta: 0:00:14 lr: 0.0 img/s: 3441.8111485078693 loss: 0.5012 (0.5050) acc1: 100.0000 (99.8685) acc5: 100.0000 (100.0000) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [22] [110/468] eta: 0:00:14 lr: 0.0 img/s: 3550.592648439877 loss: 0.5015 (0.5047) acc1: 100.0000 (99.8733) acc5: 100.0000 (100.0000) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [22] [120/468] eta: 0:00:13 lr: 0.0 img/s: 3530.7678931965406 loss: 0.5019 (0.5047) acc1: 100.0000 (99.8773) acc5: 100.0000 (100.0000) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [22] [130/468] eta: 0:00:13 lr: 0.0 img/s: 3508.9831436806776 loss: 0.5032 (0.5050) acc1: 100.0000 (99.8628) acc5: 100.0000 (100.0000) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [22] [140/468] eta: 0:00:12 lr: 0.0 img/s: 3529.468033212588 loss: 0.5011 (0.5050) acc1: 100.0000 (99.8615) acc5: 100.0000 (100.0000) time: 0.0364 data: 0.0002 max mem: 190
Epoch: [22] [150/468] eta: 0:00:12 lr: 0.0 img/s: 2792.9587614385377 loss: 0.5011 (0.5051) acc1: 100.0000 (99.8603) acc5: 100.0000 (100.0000) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [22] [160/468] eta: 0:00:12 lr: 0.0 img/s: 2942.4194586181006 loss: 0.5013 (0.5049) acc1: 100.0000 (99.8690) acc5: 100.0000 (100.0000) time: 0.0412 data: 0.0003 max mem: 190
Epoch: [22] [170/468] eta: 0:00:11 lr: 0.0 img/s: 2722.703843637636 loss: 0.5013 (0.5051) acc1: 100.0000 (99.8675) acc5: 100.0000 (100.0000) time: 0.0481 data: 0.0007 max mem: 190
Epoch: [22] [180/468] eta: 0:00:11 lr: 0.0 img/s: 3555.0362674400894 loss: 0.5030 (0.5052) acc1: 100.0000 (99.8576) acc5: 100.0000 (100.0000) time: 0.0497 data: 0.0020 max mem: 190
Epoch: [22] [190/468] eta: 0:00:11 lr: 0.0 img/s: 3453.900964365442 loss: 0.5035 (0.5053) acc1: 100.0000 (99.8527) acc5: 100.0000 (100.0000) time: 0.0437 data: 0.0016 max mem: 190
Epoch: [22] [200/468] eta: 0:00:10 lr: 0.0 img/s: 3532.4173070849565 loss: 0.5012 (0.5051) acc1: 100.0000 (99.8601) acc5: 100.0000 (100.0000) time: 0.0420 data: 0.0007 max mem: 190
Epoch: [22] [210/468] eta: 0:00:10 lr: 0.0 img/s: 3577.0647157981703 loss: 0.5012 (0.5052) acc1: 100.0000 (99.8593) acc5: 100.0000 (100.0000) time: 0.0391 data: 0.0007 max mem: 190
Epoch: [22] [220/468] eta: 0:00:09 lr: 0.0 img/s: 3581.6704604587244 loss: 0.5013 (0.5052) acc1: 100.0000 (99.8621) acc5: 100.0000 (100.0000) time: 0.0362 data: 0.0002 max mem: 190
Epoch: [22] [230/468] eta: 0:00:09 lr: 0.0 img/s: 3663.3725597232365 loss: 0.5018 (0.5053) acc1: 100.0000 (99.8580) acc5: 100.0000 (100.0000) time: 0.0360 data: 0.0002 max mem: 190
Epoch: [22] [240/468] eta: 0:00:09 lr: 0.0 img/s: 3384.7210369697887 loss: 0.5011 (0.5051) acc1: 100.0000 (99.8638) acc5: 100.0000 (100.0000) time: 0.0362 data: 0.0002 max mem: 190
Epoch: [22] [250/468] eta: 0:00:08 lr: 0.0 img/s: 3542.907281534177 loss: 0.5011 (0.5053) acc1: 100.0000 (99.8599) acc5: 100.0000 (99.9969) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [22] [260/468] eta: 0:00:08 lr: 0.0 img/s: 3555.4835958092162 loss: 0.5016 (0.5053) acc1: 100.0000 (99.8623) acc5: 100.0000 (99.9970) time: 0.0364 data: 0.0002 max mem: 190
Epoch: [22] [270/468] eta: 0:00:07 lr: 0.0 img/s: 3492.0704566150644 loss: 0.5015 (0.5053) acc1: 100.0000 (99.8645) acc5: 100.0000 (99.9971) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [22] [280/468] eta: 0:00:07 lr: 0.0 img/s: 3521.4348345118 loss: 0.5016 (0.5054) acc1: 100.0000 (99.8638) acc5: 100.0000 (99.9972) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [22] [290/468] eta: 0:00:06 lr: 0.0 img/s: 3505.0428083645074 loss: 0.5013 (0.5054) acc1: 100.0000 (99.8631) acc5: 100.0000 (99.9973) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [22] [300/468] eta: 0:00:06 lr: 0.0 img/s: 3506.233138931158 loss: 0.5011 (0.5053) acc1: 100.0000 (99.8676) acc5: 100.0000 (99.9974) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [22] [310/468] eta: 0:00:06 lr: 0.0 img/s: 3525.0417722682564 loss: 0.5024 (0.5054) acc1: 100.0000 (99.8618) acc5: 100.0000 (99.9975) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [22] [320/468] eta: 0:00:05 lr: 0.0 img/s: 3452.812512862729 loss: 0.5018 (0.5053) acc1: 100.0000 (99.8637) acc5: 100.0000 (99.9976) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [22] [330/468] eta: 0:00:05 lr: 0.0 img/s: 3561.8052942347244 loss: 0.5023 (0.5053) acc1: 100.0000 (99.8631) acc5: 100.0000 (99.9976) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [22] [340/468] eta: 0:00:04 lr: 0.0 img/s: 3574.278394716519 loss: 0.5026 (0.5055) acc1: 100.0000 (99.8580) acc5: 100.0000 (99.9977) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [22] [350/468] eta: 0:00:04 lr: 0.0 img/s: 3384.678359328702 loss: 0.5016 (0.5055) acc1: 100.0000 (99.8598) acc5: 100.0000 (99.9978) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [22] [360/468] eta: 0:00:04 lr: 0.0 img/s: 3586.3839088291684 loss: 0.5012 (0.5054) acc1: 100.0000 (99.8637) acc5: 100.0000 (99.9978) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [22] [370/468] eta: 0:00:03 lr: 0.0 img/s: 3521.6196261069204 loss: 0.5016 (0.5053) acc1: 100.0000 (99.8652) acc5: 100.0000 (99.9979) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [22] [380/468] eta: 0:00:03 lr: 0.0 img/s: 3221.6056214633327 loss: 0.5016 (0.5054) acc1: 100.0000 (99.8626) acc5: 100.0000 (99.9979) time: 0.0375 data: 0.0002 max mem: 190
Epoch: [22] [390/468] eta: 0:00:03 lr: 0.0 img/s: 3571.116305367275 loss: 0.5020 (0.5054) acc1: 100.0000 (99.8641) acc5: 100.0000 (99.9980) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [22] [400/468] eta: 0:00:02 lr: 0.0 img/s: 3317.9503609215863 loss: 0.5021 (0.5053) acc1: 100.0000 (99.8675) acc5: 100.0000 (99.9981) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [22] [410/468] eta: 0:00:02 lr: 0.0 img/s: 3300.0842861006613 loss: 0.5026 (0.5054) acc1: 100.0000 (99.8669) acc5: 100.0000 (99.9981) time: 0.0388 data: 0.0003 max mem: 190
Epoch: [22] [420/468] eta: 0:00:01 lr: 0.0 img/s: 3266.949700001217 loss: 0.5020 (0.5054) acc1: 100.0000 (99.8664) acc5: 100.0000 (99.9981) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [22] [430/468] eta: 0:00:01 lr: 0.0 img/s: 3328.420585372507 loss: 0.5020 (0.5054) acc1: 100.0000 (99.8695) acc5: 100.0000 (99.9982) time: 0.0382 data: 0.0002 max mem: 190
Epoch: [22] [440/468] eta: 0:00:01 lr: 0.0 img/s: 3564.5721949632502 loss: 0.5018 (0.5053) acc1: 100.0000 (99.8707) acc5: 100.0000 (99.9982) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [22] [450/468] eta: 0:00:00 lr: 0.0 img/s: 3527.3114504218024 loss: 0.5012 (0.5053) acc1: 100.0000 (99.8718) acc5: 100.0000 (99.9983) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [22] [460/468] eta: 0:00:00 lr: 0.0 img/s: 2299.469375867327 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8712) acc5: 100.0000 (99.9983) time: 0.0474 data: 0.0027 max mem: 190
Epoch: [22] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:09 loss: 0.5526 (0.5526) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1234 data: 0.1042 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.760 Acc@5 99.900
Epoch: [23] [ 0/468] eta: 0:01:14 lr: 0.0 img/s: 2733.669966190069 loss: 0.5012 (0.5012) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.1601 data: 0.1132 max mem: 190
Epoch: [23] [ 10/468] eta: 0:00:22 lr: 0.0 img/s: 3446.274060712658 loss: 0.5017 (0.5046) acc1: 100.0000 (99.8580) acc5: 100.0000 (100.0000) time: 0.0492 data: 0.0106 max mem: 190
Epoch: [23] [ 20/468] eta: 0:00:19 lr: 0.0 img/s: 3491.5708172370287 loss: 0.5015 (0.5045) acc1: 100.0000 (99.8512) acc5: 100.0000 (100.0000) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [23] [ 30/468] eta: 0:00:18 lr: 0.0 img/s: 3525.875192098039 loss: 0.5010 (0.5035) acc1: 100.0000 (99.8992) acc5: 100.0000 (100.0000) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [23] [ 40/468] eta: 0:00:17 lr: 0.0 img/s: 3545.949327626747 loss: 0.5010 (0.5039) acc1: 100.0000 (99.9047) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [23] [ 50/468] eta: 0:00:16 lr: 0.0 img/s: 3406.779102602339 loss: 0.5022 (0.5054) acc1: 100.0000 (99.8468) acc5: 100.0000 (100.0000) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [23] [ 60/468] eta: 0:00:16 lr: 0.0 img/s: 3442.517373824165 loss: 0.5016 (0.5047) acc1: 100.0000 (99.8719) acc5: 100.0000 (100.0000) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [23] [ 70/468] eta: 0:00:15 lr: 0.0 img/s: 3544.2638569806436 loss: 0.5016 (0.5044) acc1: 100.0000 (99.8900) acc5: 100.0000 (100.0000) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [23] [ 80/468] eta: 0:00:15 lr: 0.0 img/s: 3308.952418519797 loss: 0.5017 (0.5041) acc1: 100.0000 (99.9035) acc5: 100.0000 (100.0000) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [23] [ 90/468] eta: 0:00:14 lr: 0.0 img/s: 3444.7925056143727 loss: 0.5013 (0.5038) acc1: 100.0000 (99.9141) acc5: 100.0000 (100.0000) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [23] [100/468] eta: 0:00:14 lr: 0.0 img/s: 3346.699946390056 loss: 0.5014 (0.5042) acc1: 100.0000 (99.8994) acc5: 100.0000 (100.0000) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [23] [110/468] eta: 0:00:13 lr: 0.0 img/s: 3565.8269925611053 loss: 0.5017 (0.5043) acc1: 100.0000 (99.9015) acc5: 100.0000 (100.0000) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [23] [120/468] eta: 0:00:13 lr: 0.0 img/s: 3452.7014849543066 loss: 0.5016 (0.5047) acc1: 100.0000 (99.8902) acc5: 100.0000 (100.0000) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [23] [130/468] eta: 0:00:13 lr: 0.0 img/s: 3512.19693966335 loss: 0.5024 (0.5048) acc1: 100.0000 (99.8867) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [23] [140/468] eta: 0:00:12 lr: 0.0 img/s: 3590.485347029948 loss: 0.5020 (0.5048) acc1: 100.0000 (99.8892) acc5: 100.0000 (100.0000) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [23] [150/468] eta: 0:00:12 lr: 0.0 img/s: 3541.8087491176334 loss: 0.5026 (0.5050) acc1: 100.0000 (99.8655) acc5: 100.0000 (100.0000) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [23] [160/468] eta: 0:00:11 lr: 0.0 img/s: 3459.130640962862 loss: 0.5020 (0.5050) acc1: 100.0000 (99.8690) acc5: 100.0000 (100.0000) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [23] [170/468] eta: 0:00:11 lr: 0.0 img/s: 3506.5079454237884 loss: 0.5016 (0.5050) acc1: 100.0000 (99.8721) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [23] [180/468] eta: 0:00:11 lr: 0.0 img/s: 2821.315311811909 loss: 0.5016 (0.5049) acc1: 100.0000 (99.8748) acc5: 100.0000 (100.0000) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [23] [190/468] eta: 0:00:10 lr: 0.0 img/s: 2042.9344353371844 loss: 0.5018 (0.5054) acc1: 100.0000 (99.8609) acc5: 100.0000 (100.0000) time: 0.0486 data: 0.0024 max mem: 190
Epoch: [23] [200/468] eta: 0:00:10 lr: 0.0 img/s: 2757.029867302083 loss: 0.5019 (0.5053) acc1: 100.0000 (99.8562) acc5: 100.0000 (100.0000) time: 0.0597 data: 0.0042 max mem: 190
Epoch: [23] [210/468] eta: 0:00:10 lr: 0.0 img/s: 3406.9088163063275 loss: 0.5013 (0.5052) acc1: 100.0000 (99.8593) acc5: 100.0000 (100.0000) time: 0.0557 data: 0.0046 max mem: 190
Epoch: [23] [220/468] eta: 0:00:10 lr: 0.0 img/s: 2642.445376331384 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8586) acc5: 100.0000 (100.0000) time: 0.0512 data: 0.0032 max mem: 190
Epoch: [23] [230/468] eta: 0:00:09 lr: 0.0 img/s: 3511.8523228279496 loss: 0.5018 (0.5054) acc1: 100.0000 (99.8580) acc5: 100.0000 (100.0000) time: 0.0497 data: 0.0029 max mem: 190
Epoch: [23] [240/468] eta: 0:00:09 lr: 0.0 img/s: 3404.683434166635 loss: 0.5018 (0.5052) acc1: 100.0000 (99.8638) acc5: 100.0000 (100.0000) time: 0.0423 data: 0.0024 max mem: 190
Epoch: [23] [250/468] eta: 0:00:08 lr: 0.0 img/s: 3574.5639714498775 loss: 0.5021 (0.5055) acc1: 100.0000 (99.8537) acc5: 100.0000 (100.0000) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [23] [260/468] eta: 0:00:08 lr: 0.0 img/s: 3545.7151386265473 loss: 0.5029 (0.5056) acc1: 100.0000 (99.8473) acc5: 100.0000 (100.0000) time: 0.0366 data: 0.0002 max mem: 190
Epoch: [23] [270/468] eta: 0:00:08 lr: 0.0 img/s: 3458.840925929505 loss: 0.5019 (0.5058) acc1: 100.0000 (99.8443) acc5: 100.0000 (100.0000) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [23] [280/468] eta: 0:00:07 lr: 0.0 img/s: 3398.4548947618296 loss: 0.5020 (0.5057) acc1: 100.0000 (99.8471) acc5: 100.0000 (100.0000) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [23] [290/468] eta: 0:00:07 lr: 0.0 img/s: 3445.367286169011 loss: 0.5016 (0.5056) acc1: 100.0000 (99.8523) acc5: 100.0000 (100.0000) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [23] [300/468] eta: 0:00:06 lr: 0.0 img/s: 3603.692572057619 loss: 0.5012 (0.5056) acc1: 100.0000 (99.8521) acc5: 100.0000 (100.0000) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [23] [310/468] eta: 0:00:06 lr: 0.0 img/s: 3618.4356242122785 loss: 0.5012 (0.5056) acc1: 100.0000 (99.8543) acc5: 100.0000 (99.9975) time: 0.0364 data: 0.0002 max mem: 190
Epoch: [23] [320/468] eta: 0:00:05 lr: 0.0 img/s: 3546.9098260473165 loss: 0.5012 (0.5055) acc1: 100.0000 (99.8588) acc5: 100.0000 (99.9976) time: 0.0362 data: 0.0002 max mem: 190
Epoch: [23] [330/468] eta: 0:00:05 lr: 0.0 img/s: 3492.774736677748 loss: 0.5013 (0.5054) acc1: 100.0000 (99.8607) acc5: 100.0000 (99.9976) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [23] [340/468] eta: 0:00:05 lr: 0.0 img/s: 3468.15834625323 loss: 0.5015 (0.5055) acc1: 100.0000 (99.8602) acc5: 100.0000 (99.9977) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [23] [350/468] eta: 0:00:04 lr: 0.0 img/s: 3601.323566502992 loss: 0.5015 (0.5054) acc1: 100.0000 (99.8642) acc5: 100.0000 (99.9978) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [23] [360/468] eta: 0:00:04 lr: 0.0 img/s: 2893.6212487064504 loss: 0.5011 (0.5053) acc1: 100.0000 (99.8680) acc5: 100.0000 (99.9978) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [23] [370/468] eta: 0:00:03 lr: 0.0 img/s: 3375.824741879095 loss: 0.5014 (0.5053) acc1: 100.0000 (99.8694) acc5: 100.0000 (99.9979) time: 0.0392 data: 0.0003 max mem: 190
Epoch: [23] [380/468] eta: 0:00:03 lr: 0.0 img/s: 3237.5166558120463 loss: 0.5017 (0.5052) acc1: 100.0000 (99.8708) acc5: 100.0000 (99.9979) time: 0.0395 data: 0.0003 max mem: 190
Epoch: [23] [390/468] eta: 0:00:03 lr: 0.0 img/s: 3385.7456233287926 loss: 0.5013 (0.5054) acc1: 100.0000 (99.8621) acc5: 100.0000 (99.9980) time: 0.0392 data: 0.0003 max mem: 190
Epoch: [23] [400/468] eta: 0:00:02 lr: 0.0 img/s: 3163.553884682243 loss: 0.5014 (0.5055) acc1: 100.0000 (99.8597) acc5: 100.0000 (99.9981) time: 0.0394 data: 0.0003 max mem: 190
Epoch: [23] [410/468] eta: 0:00:02 lr: 0.0 img/s: 3531.023335350293 loss: 0.5014 (0.5054) acc1: 100.0000 (99.8631) acc5: 100.0000 (99.9981) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [23] [420/468] eta: 0:00:01 lr: 0.0 img/s: 3491.4345767649966 loss: 0.5012 (0.5054) acc1: 100.0000 (99.8645) acc5: 100.0000 (99.9981) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [23] [430/468] eta: 0:00:01 lr: 0.0 img/s: 3502.344669219579 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8677) acc5: 100.0000 (99.9982) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [23] [440/468] eta: 0:00:01 lr: 0.0 img/s: 3352.5516242241065 loss: 0.5015 (0.5053) acc1: 100.0000 (99.8671) acc5: 100.0000 (99.9982) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [23] [450/468] eta: 0:00:00 lr: 0.0 img/s: 3509.4418972538715 loss: 0.5017 (0.5054) acc1: 100.0000 (99.8649) acc5: 100.0000 (99.9983) time: 0.0378 data: 0.0003 max mem: 190
Epoch: [23] [460/468] eta: 0:00:00 lr: 0.0 img/s: 3522.7056685235857 loss: 0.5021 (0.5054) acc1: 100.0000 (99.8661) acc5: 100.0000 (99.9983) time: 0.0377 data: 0.0003 max mem: 190
Epoch: [23] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:10 loss: 0.5526 (0.5526) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1268 data: 0.1103 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.760 Acc@5 99.900
Epoch: [24] [ 0/468] eta: 0:01:47 lr: 0.0 img/s: 1711.2187061733432 loss: 0.5025 (0.5025) acc1: 100.0000 (100.0000) acc5: 100.0000 (100.0000) time: 0.2291 data: 0.1543 max mem: 190
Epoch: [24] [ 10/468] eta: 0:00:27 lr: 0.0 img/s: 3611.1340611148107 loss: 0.5018 (0.5043) acc1: 100.0000 (99.9290) acc5: 100.0000 (100.0000) time: 0.0607 data: 0.0158 max mem: 190
Epoch: [24] [ 20/468] eta: 0:00:22 lr: 0.0 img/s: 3511.41590524092 loss: 0.5014 (0.5039) acc1: 100.0000 (99.9256) acc5: 100.0000 (100.0000) time: 0.0407 data: 0.0011 max mem: 190
Epoch: [24] [ 30/468] eta: 0:00:19 lr: 0.0 img/s: 3481.3144765424895 loss: 0.5014 (0.5041) acc1: 100.0000 (99.9244) acc5: 100.0000 (100.0000) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [24] [ 40/468] eta: 0:00:18 lr: 0.0 img/s: 3515.209470431093 loss: 0.5016 (0.5042) acc1: 100.0000 (99.9238) acc5: 100.0000 (100.0000) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [24] [ 50/468] eta: 0:00:17 lr: 0.0 img/s: 3570.3563367449406 loss: 0.5016 (0.5043) acc1: 100.0000 (99.9234) acc5: 100.0000 (100.0000) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [24] [ 60/468] eta: 0:00:16 lr: 0.0 img/s: 3415.687386275433 loss: 0.5020 (0.5050) acc1: 100.0000 (99.9103) acc5: 100.0000 (100.0000) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [24] [ 70/468] eta: 0:00:16 lr: 0.0 img/s: 3545.2234424010303 loss: 0.5020 (0.5053) acc1: 100.0000 (99.9010) acc5: 100.0000 (100.0000) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [24] [ 80/468] eta: 0:00:15 lr: 0.0 img/s: 3526.755340673201 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8939) acc5: 100.0000 (100.0000) time: 0.0368 data: 0.0003 max mem: 190
Epoch: [24] [ 90/468] eta: 0:00:15 lr: 0.0 img/s: 3453.1012188454733 loss: 0.5016 (0.5053) acc1: 100.0000 (99.8884) acc5: 100.0000 (100.0000) time: 0.0369 data: 0.0003 max mem: 190
Epoch: [24] [100/468] eta: 0:00:14 lr: 0.0 img/s: 3487.6014993146546 loss: 0.5016 (0.5054) acc1: 100.0000 (99.8840) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [24] [110/468] eta: 0:00:14 lr: 0.0 img/s: 2461.0849347220183 loss: 0.5017 (0.5054) acc1: 100.0000 (99.8874) acc5: 100.0000 (100.0000) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [24] [120/468] eta: 0:00:14 lr: 0.0 img/s: 2650.651525848832 loss: 0.5020 (0.5053) acc1: 100.0000 (99.8838) acc5: 100.0000 (100.0000) time: 0.0487 data: 0.0037 max mem: 190
Epoch: [24] [130/468] eta: 0:00:13 lr: 0.0 img/s: 3563.412884469873 loss: 0.5020 (0.5059) acc1: 100.0000 (99.8628) acc5: 100.0000 (100.0000) time: 0.0480 data: 0.0042 max mem: 190
Epoch: [24] [140/468] eta: 0:00:13 lr: 0.0 img/s: 3567.5093329080532 loss: 0.5022 (0.5060) acc1: 100.0000 (99.8559) acc5: 100.0000 (100.0000) time: 0.0371 data: 0.0008 max mem: 190
Epoch: [24] [150/468] eta: 0:00:12 lr: 0.0 img/s: 3569.383099527957 loss: 0.5025 (0.5060) acc1: 100.0000 (99.8500) acc5: 100.0000 (100.0000) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [24] [160/468] eta: 0:00:12 lr: 0.0 img/s: 3519.657206542761 loss: 0.5023 (0.5060) acc1: 100.0000 (99.8447) acc5: 100.0000 (100.0000) time: 0.0367 data: 0.0002 max mem: 190
Epoch: [24] [170/468] eta: 0:00:11 lr: 0.0 img/s: 3487.624155493192 loss: 0.5023 (0.5062) acc1: 100.0000 (99.8355) acc5: 100.0000 (100.0000) time: 0.0366 data: 0.0003 max mem: 190
Epoch: [24] [180/468] eta: 0:00:11 lr: 0.0 img/s: 3531.464640684098 loss: 0.5028 (0.5061) acc1: 100.0000 (99.8403) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [24] [190/468] eta: 0:00:11 lr: 0.0 img/s: 3589.5011065275094 loss: 0.5028 (0.5063) acc1: 100.0000 (99.8282) acc5: 100.0000 (100.0000) time: 0.0368 data: 0.0002 max mem: 190
Epoch: [24] [200/468] eta: 0:00:10 lr: 0.0 img/s: 3495.2533333333336 loss: 0.5031 (0.5062) acc1: 100.0000 (99.8290) acc5: 100.0000 (100.0000) time: 0.0397 data: 0.0003 max mem: 190
Epoch: [24] [210/468] eta: 0:00:10 lr: 0.0 img/s: 3428.9075441330506 loss: 0.5015 (0.5061) acc1: 100.0000 (99.8334) acc5: 100.0000 (100.0000) time: 0.0424 data: 0.0003 max mem: 190
Epoch: [24] [220/468] eta: 0:00:09 lr: 0.0 img/s: 3414.38400386675 loss: 0.5015 (0.5061) acc1: 100.0000 (99.8339) acc5: 100.0000 (100.0000) time: 0.0392 data: 0.0003 max mem: 190
Epoch: [24] [230/468] eta: 0:00:09 lr: 0.0 img/s: 3399.0143147471654 loss: 0.5015 (0.5060) acc1: 100.0000 (99.8377) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [24] [240/468] eta: 0:00:09 lr: 0.0 img/s: 3501.956296557212 loss: 0.5014 (0.5059) acc1: 100.0000 (99.8412) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [24] [250/468] eta: 0:00:08 lr: 0.0 img/s: 3344.2399960133553 loss: 0.5023 (0.5060) acc1: 100.0000 (99.8413) acc5: 100.0000 (100.0000) time: 0.0381 data: 0.0003 max mem: 190
Epoch: [24] [260/468] eta: 0:00:08 lr: 0.0 img/s: 3525.6899536362084 loss: 0.5022 (0.5059) acc1: 100.0000 (99.8443) acc5: 100.0000 (100.0000) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [24] [270/468] eta: 0:00:07 lr: 0.0 img/s: 3487.352300776886 loss: 0.5019 (0.5060) acc1: 100.0000 (99.8414) acc5: 100.0000 (100.0000) time: 0.0374 data: 0.0003 max mem: 190
Epoch: [24] [280/468] eta: 0:00:07 lr: 0.0 img/s: 2152.0459854892374 loss: 0.5021 (0.5059) acc1: 100.0000 (99.8443) acc5: 100.0000 (100.0000) time: 0.0529 data: 0.0012 max mem: 190
Epoch: [24] [290/468] eta: 0:00:07 lr: 0.0 img/s: 3302.459367522314 loss: 0.5020 (0.5060) acc1: 100.0000 (99.8443) acc5: 100.0000 (100.0000) time: 0.0637 data: 0.0030 max mem: 190
Epoch: [24] [300/468] eta: 0:00:06 lr: 0.0 img/s: 3530.41962254225 loss: 0.5014 (0.5058) acc1: 100.0000 (99.8495) acc5: 100.0000 (100.0000) time: 0.0480 data: 0.0020 max mem: 190
Epoch: [24] [310/468] eta: 0:00:06 lr: 0.0 img/s: 3457.0146105255026 loss: 0.5014 (0.5059) acc1: 100.0000 (99.8468) acc5: 100.0000 (99.9975) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [24] [320/468] eta: 0:00:06 lr: 0.0 img/s: 3494.411580543749 loss: 0.5013 (0.5058) acc1: 100.0000 (99.8515) acc5: 100.0000 (99.9976) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [24] [330/468] eta: 0:00:05 lr: 0.0 img/s: 3519.4956929894715 loss: 0.5011 (0.5058) acc1: 100.0000 (99.8513) acc5: 100.0000 (99.9976) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [24] [340/468] eta: 0:00:05 lr: 0.0 img/s: 3488.8480264098466 loss: 0.5015 (0.5058) acc1: 100.0000 (99.8534) acc5: 100.0000 (99.9977) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [24] [350/468] eta: 0:00:04 lr: 0.0 img/s: 3487.6014993146546 loss: 0.5016 (0.5057) acc1: 100.0000 (99.8531) acc5: 100.0000 (99.9978) time: 0.0373 data: 0.0002 max mem: 190
Epoch: [24] [360/468] eta: 0:00:04 lr: 0.0 img/s: 3494.88928236642 loss: 0.5014 (0.5056) acc1: 100.0000 (99.8572) acc5: 100.0000 (99.9978) time: 0.0371 data: 0.0003 max mem: 190
Epoch: [24] [370/468] eta: 0:00:03 lr: 0.0 img/s: 3483.6185915529513 loss: 0.5014 (0.5055) acc1: 100.0000 (99.8610) acc5: 100.0000 (99.9979) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [24] [380/468] eta: 0:00:03 lr: 0.0 img/s: 3589.2371338030994 loss: 0.5013 (0.5055) acc1: 100.0000 (99.8626) acc5: 100.0000 (99.9979) time: 0.0370 data: 0.0003 max mem: 190
Epoch: [24] [390/468] eta: 0:00:03 lr: 0.0 img/s: 3577.636805874866 loss: 0.5017 (0.5054) acc1: 100.0000 (99.8641) acc5: 100.0000 (99.9980) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [24] [400/468] eta: 0:00:02 lr: 0.0 img/s: 3608.3186837559733 loss: 0.5017 (0.5054) acc1: 100.0000 (99.8636) acc5: 100.0000 (99.9981) time: 0.0369 data: 0.0002 max mem: 190
Epoch: [24] [410/468] eta: 0:00:02 lr: 0.0 img/s: 3593.03247222594 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8669) acc5: 100.0000 (99.9981) time: 0.0365 data: 0.0002 max mem: 190
Epoch: [24] [420/468] eta: 0:00:01 lr: 0.0 img/s: 3527.8909179321718 loss: 0.5011 (0.5052) acc1: 100.0000 (99.8701) acc5: 100.0000 (99.9981) time: 0.0371 data: 0.0002 max mem: 190
Epoch: [24] [430/468] eta: 0:00:01 lr: 0.0 img/s: 3469.6824961869556 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8695) acc5: 100.0000 (99.9982) time: 0.0372 data: 0.0002 max mem: 190
Epoch: [24] [440/468] eta: 0:00:01 lr: 0.0 img/s: 3280.2235731873475 loss: 0.5018 (0.5053) acc1: 100.0000 (99.8671) acc5: 100.0000 (99.9982) time: 0.0373 data: 0.0003 max mem: 190
Epoch: [24] [450/468] eta: 0:00:00 lr: 0.0 img/s: 3400.090640219381 loss: 0.5016 (0.5053) acc1: 100.0000 (99.8683) acc5: 100.0000 (99.9983) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [24] [460/468] eta: 0:00:00 lr: 0.0 img/s: 3525.0186273415493 loss: 0.5012 (0.5053) acc1: 100.0000 (99.8678) acc5: 100.0000 (99.9983) time: 0.0383 data: 0.0003 max mem: 190
Epoch: [24] Total time: 0:00:18
Test: [ 0/79] eta: 0:00:10 loss: 0.5526 (0.5526) acc1: 98.4375 (98.4375) acc5: 100.0000 (100.0000) time: 0.1272 data: 0.1089 max mem: 190
Test: Total time: 0:00:01
Test: Acc@1 98.760 Acc@5 99.900
Epoch: [25] [ 0/468] eta: 0:01:19 lr: 0.0 img/s: 2520.7810759796785 loss: 0.5107 (0.5107) acc1: 99.2188 (99.2188) acc5: 100.0000 (100.0000) time: 0.1689 data: 0.1181 max mem: 190
Epoch: [25] [ 10/468] eta: 0:00:23 lr: 0.0 img/s: 3186.9908166475716 loss: 0.5024 (0.5065) acc1: 100.0000 (99.7869) acc5: 100.0000 (100.0000) time: 0.0521 data: 0.0111 max mem: 190
Epoch: [25] [ 20/468] eta: 0:00:20 lr: 0.0 img/s: 3260.640089400675 loss: 0.5013 (0.5041) acc1: 100.0000 (99.8884) acc5: 100.0000 (100.0000) time: 0.0400 data: 0.0003 max mem: 190
Epoch: [25] [ 30/468] eta: 0:00:20 lr: 0.0 img/s: 2113.99792093243 loss: 0.5013 (0.5043) acc1: 100.0000 (99.8992) acc5: 100.0000 (100.0000) time: 0.0456 data: 0.0007 max mem: 190
Epoch: [25] [ 40/468] eta: 0:00:20 lr: 0.0 img/s: 3353.9967888847937 loss: 0.5018 (0.5048) acc1: 100.0000 (99.8476) acc5: 100.0000 (100.0000) time: 0.0491 data: 0.0017 max mem: 190
Epoch: [25] [ 50/468] eta: 0:00:19 lr: 0.0 img/s: 3436.0398087643284 loss: 0.5022 (0.5044) acc1: 100.0000 (99.8775) acc5: 100.0000 (100.0000) time: 0.0437 data: 0.0023 max mem: 190
Epoch: [25] [ 60/468] eta: 0:00:18 lr: 0.0 img/s: 3491.0713208136085 loss: 0.5016 (0.5045) acc1: 100.0000 (99.8847) acc5: 100.0000 (100.0000) time: 0.0389 data: 0.0012 max mem: 190
Epoch: [25] [ 70/468] eta: 0:00:17 lr: 0.0 img/s: 3572.019188417754 loss: 0.5019 (0.5045) acc1: 100.0000 (99.8900) acc5: 100.0000 (100.0000) time: 0.0380 data: 0.0003 max mem: 190
Epoch: [25] [ 80/468] eta: 0:00:16 lr: 0.0 img/s: 3435.6660011262993 loss: 0.5024 (0.5046) acc1: 100.0000 (99.8843) acc5: 100.0000 (100.0000) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [25] [ 90/468] eta: 0:00:16 lr: 0.0 img/s: 3535.301672593178 loss: 0.5018 (0.5043) acc1: 100.0000 (99.8970) acc5: 100.0000 (100.0000) time: 0.0372 data: 0.0003 max mem: 190
Epoch: [25] [100/468] eta: 0:00:15 lr: 0.0 img/s: 3394.9304852060527 loss: 0.5015 (0.5049) acc1: 100.0000 (99.8840) acc5: 100.0000 (100.0000) time: 0.0367 data: 0.0003 max mem: 190
Epoch: [25] [110/468] eta: 0:00:14 lr: 0.0 img/s: 3215.5082982457193 loss: 0.5013 (0.5053) acc1: 100.0000 (99.8733) acc5: 100.0000 (100.0000) time: 0.0391 data: 0.0003 max mem: 190
Epoch: [25] [120/468] eta: 0:00:14 lr: 0.0 img/s: 3210.969634985855 loss: 0.5017 (0.5053) acc1: 100.0000 (99.8709) acc5: 100.0000 (100.0000) time: 0.0409 data: 0.0003 max mem: 190
Epoch: [25] [130/468] eta: 0:00:14 lr: 0.0 img/s: 3372.6860574688094 loss: 0.5017 (0.5050) acc1: 100.0000 (99.8807) acc5: 100.0000 (100.0000) time: 0.0408 data: 0.0003 max mem: 190
Epoch: [25] [140/468] eta: 0:00:13 lr: 0.0 img/s: 3508.8226082637284 loss: 0.5019 (0.5051) acc1: 100.0000 (99.8726) acc5: 100.0000 (100.0000) time: 0.0401 data: 0.0004 max mem: 190
Epoch: [25] [150/468] eta: 0:00:13 lr: 0.0 img/s: 3403.971062459184 loss: 0.5014 (0.5051) acc1: 100.0000 (99.8707) acc5: 100.0000 (100.0000) time: 0.0385 data: 0.0004 max mem: 190
Epoch: [25] [160/468] eta: 0:00:12 lr: 0.0 img/s: 3460.245382007554 loss: 0.5012 (0.5052) acc1: 100.0000 (99.8690) acc5: 100.0000 (99.9951) time: 0.0382 data: 0.0003 max mem: 190
Epoch: [25] [170/468] eta: 0:00:12 lr: 0.0 img/s: 3412.3874149876056 loss: 0.5012 (0.5051) acc1: 100.0000 (99.8766) acc5: 100.0000 (99.9954) time: 0.0379 data: 0.0003 max mem: 190
Epoch: [25] [180/468] eta: 0:00:11 lr: 0.0 img/s: 3046.127945440206 loss: 0.5021 (0.5051) acc1: 100.0000 (99.8748) acc5: 100.0000 (99.9957) time: 0.0423 data: 0.0010 max mem: 190
Epoch: [25] [190/468] eta: 0:00:11 lr: 0.0 img/s: 3169.0065815492317 loss: 0.5019 (0.5049) acc1: 100.0000 (99.8773) acc5: 100.0000 (99.9959) time: 0.0466 data: 0.0016 max mem: 190
Epoch: [25] [200/468] eta: 0:00:11 lr: 0.0 img/s: 3170.6160978461794 loss: 0.5016 (0.5048) acc1: 100.0000 (99.8795) acc5: 100.0000 (99.9961) time: 0.0438 data: 0.0011 max mem: 190
Epoch: [25] [210/468] eta: 0:00:10 lr: 0.0 img/s: 3243.109978132438 loss: 0.5014 (0.5048) acc1: 100.0000 (99.8741) acc5: 100.0000 (99.9963) time: 0.0413 data: 0.0005 max mem: 190
Epoch: [25] [220/468] eta: 0:00:10 lr: 0.0 img/s: 3291.707513274228 loss: 0.5015 (0.5048) acc1: 100.0000 (99.8763) acc5: 100.0000 (99.9965) time: 0.0406 data: 0.0003 max mem: 190
Epoch: [25] [230/468] eta: 0:00:09 lr: 0.0 img/s: 3252.205986224777 loss: 0.5015 (0.5048) acc1: 100.0000 (99.8782) acc5: 100.0000 (99.9966) time: 0.0402 data: 0.0004 max mem: 190
Epoch: [25] [240/468] eta: 0:00:09 lr: 0.0 img/s: 3223.036818692105 loss: 0.5016 (0.5047) acc1: 100.0000 (99.8833) acc5: 100.0000 (99.9968) time: 0.0404 data: 0.0004 max mem: 190
Epoch: [25] [250/468] eta: 0:00:08 lr: 0.0 img/s: 3207.804019980402 loss: 0.5014 (0.5049) acc1: 100.0000 (99.8786) acc5: 100.0000 (99.9969) time: 0.0403 data: 0.0003 max mem: 190
Epoch: [25] [260/468] eta: 0:00:08 lr: 0.0 img/s: 3453.0123810932664 loss: 0.5013 (0.5049) acc1: 100.0000 (99.8803) acc5: 100.0000 (99.9970) time: 0.0397 data: 0.0003 max mem: 190
Epoch: [25] [270/468] eta: 0:00:08 lr: 0.0 img/s: 3455.723990550796 loss: 0.5015 (0.5050) acc1: 100.0000 (99.8789) acc5: 100.0000 (99.9971) time: 0.0384 data: 0.0003 max mem: 190
Epoch: [25] [280/468] eta: 0:00:07 lr: 0.0 img/s: 3354.436868939318 loss: 0.5015 (0.5050) acc1: 100.0000 (99.8804) acc5: 100.0000 (99.9972) time: 0.0377 data: 0.0003 max mem: 190