{"id":1912,"date":"2024-09-14T14:27:15","date_gmt":"2024-09-14T06:27:15","guid":{"rendered":"https:\/\/www.gnn.club\/?p=1912"},"modified":"2025-03-12T15:05:50","modified_gmt":"2025-03-12T07:05:50","slug":"encoder-decoder","status":"publish","type":"post","link":"http:\/\/www.gnn.club\/?p=1912","title":{"rendered":"Encoder-Decoder"},"content":{"rendered":"<h1><img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914154545886.png\" style=\"height:50px;display:inline\"> Deep Learning<\/h1>\n<hr \/>\n<p>Created by Arwin Yu<\/p>\n<h2>Tutorial 04 - Encoder-Decoder<\/h2>\n<hr \/>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/bubbles\/50\/null\/checklist.png\" style=\"height:50px;display:inline\"> Agenda<\/h3>\n<hr \/>\n<ul>\n<li>Autoencoders\n<ul>\n<li>Latent space<\/li>\n<li>Maximum likelihood estimation<\/li>\n<li>Latent variable models<\/li>\n<li>Monte Carlo sampling<\/li>\n<li>Variational autoencoders<\/li>\n<\/ul>\n<\/li>\n<li>Sequence-to-sequence models (Seq2Seq)\n<ul>\n<li>Seq2Seq model architecture<\/li>\n<li>Attention in Seq2Seq<\/li>\n<\/ul>\n<\/li>\n<li>Self-supervised learning\n<ul>\n<li>Masked Autoencoders (Vision Transformers)<\/li>\n<li>BERT<\/li>\n<li>GPT<\/li>\n<\/ul>\n<\/li>\n<li>Contrastive learning (contrastive methods)\n<ul>\n<li>Contrastive Predictive Coding (CPC)<\/li>\n<li>Simple Framework for Contrastive Learning of Visual Representations (SimCLR)<\/li>\n<li>Contrastive Language\u2013Image Pre-training (CLIP)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/color\/96\/000000\/self-esteem.png\" style=\"height:50px;display:inline\"> 
Autoencoders<\/h2>\n<hr \/>\n<p>An autoencoder is a neural network architecture for unsupervised learning, used mainly for data compression and feature learning. It consists of two main parts: an encoder and a decoder. Autoencoders are widely used in image processing, natural language processing, anomaly detection, and other fields. By learning a compact latent representation of the data, they provide a powerful tool for data analysis and understanding.<\/p>\n<p>Below we introduce autoencoders using image generation as a running example:<\/p>\n<ul>\n<li>\n<p>Most natural data, such as images, is high-dimensional. Consider the MNIST (handwritten digits) dataset: each image has $28 \\times 28 = 784$ pixels, so it can be represented as a vector of length 784.<\/p>\n<\/li>\n<li>\n<p>But do we really need 784 values to represent a digit? Probably not. We assume the data lies in a lower-dimensional space that is sufficient to describe the observations. For MNIST, if only the identity of the digit matters, we could represent it as a one-hot vector, which needs only 10 dimensions. We can therefore <strong>encode<\/strong> high-dimensional observations in a low-dimensional space.<\/p>\n<\/li>\n<li>\n<p>But how can we learn a meaningful low-dimensional representation? The general idea is to reconstruct, or <strong>decode<\/strong>, the low-dimensional representation back into the high-dimensional one, and to use the reconstruction error (through its gradient) to find the best representation. This is the core idea behind the <strong>autoencoder<\/strong>.<\/p>\n<\/li>\n<li>\n<p><strong>Autoencoder<\/strong> - a model that takes data as input and discovers a latent representation of that data. The input is transformed into an encoding vector in which each dimension represents some learned attribute of the data. The most important detail to grasp is that the encoder network outputs a single value for each encoding dimension. The decoder network then takes these values and tries to recreate the original input. An autoencoder has <strong>three parts<\/strong>: an encoder, a decoder, and a \u201closs\u201d function that ties the two together by comparing the reconstruction with the original input. For the simplest autoencoder (one that compresses the input and then reconstructs it from the compressed representation), the \u201closs\u201d can be viewed as the amount of information lost during reconstruction.<\/p>\n<p align=\"center\">\n<img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914154726353.png\" style=\"height:300px\">\n<\/p>\n<\/li>\n<li>\n<p>The basic architecture of an autoencoder:<\/p>\n<p align=\"center\">\n<img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914154800581.png\" style=\"height:250px\">\n<\/p>\n<\/li>\n<\/ul>\n<p>Let's implement it in PyTorch using what we have learnt so far! Note that the code below trains on Fashion-MNIST, which has the same 28x28 grayscale format as MNIST.<\/p>\n<pre><code class=\"language-python\">import torch.nn as nn\n\n\nclass AutoEncoder(nn.Module):\n    # Fully connected autoencoder: the encoder compresses the input to latent_dim values,\n    # and the decoder reconstructs the input from them.\n\n    def __init__(self, input_dim=28*28, hidden_dim=256, latent_dim=10):\n        super(AutoEncoder, self).__init__()\n\n        self.input_dim = input_dim\n        self.hidden_dim = hidden_dim\n        self.latent_dim = latent_dim\n\n        # define the encoder\n        self.encoder = nn.Sequential(nn.Linear(self.input_dim, self.hidden_dim),\n                                     nn.ReLU(),\n                                     nn.Linear(self.hidden_dim, self.hidden_dim),\n                                     nn.ReLU(),\n                                     nn.Linear(self.hidden_dim, self.latent_dim)\n                                    )\n\n        # define the decoder; the final Sigmoid keeps outputs in [0, 1], matching the pixel range\n        self.decoder = nn.Sequential(nn.Linear(self.latent_dim, self.hidden_dim),\n                                     nn.ReLU(),\n                                     nn.Linear(self.hidden_dim, self.hidden_dim),\n                                     nn.ReLU(),\n                                     nn.Linear(self.hidden_dim, self.input_dim),\n                                     nn.Sigmoid())\n\n    def forward(self, x):\n        x = self.encoder(x)\n        x = self.decoder(x)\n        return x\n\n    def get_latent_rep(self, x):\n        # return the latent code produced by the encoder\n        return self.encoder(x)<\/code><\/pre>\n<pre><code 
class=\"language-python\">import torch\n# hyper-parameters:\nnum_epochs = 5\nlearning_rate = 0.001\n\n# Device configuration, as before\ndevice = torch.device(&#039;cuda:0&#039; if torch.cuda.is_available() else &#039;cpu&#039;)\n\n# create model, send it to device\nmodel = AutoEncoder(input_dim=28*28, hidden_dim=128, latent_dim=10).to(device)\n\n# Loss and optimizer\ncriterion = nn.BCELoss()  # binary cross entropy, as pixels are in [0,1], can also use MSE\noptimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)<\/code><\/pre>\n<pre><code class=\"language-python\">import torch\nimport torch.nn as nn\nimport torch.optim as optim\nfrom torchvision import datasets, transforms\nfrom torch.utils.data import DataLoader\n\n# Hyperparameters\nbatch_size = 8\nnum_epochs = 5\nlearning_rate = 0.001\n\n# Device configuration\ndevice = torch.device(&#039;cuda&#039; if torch.cuda.is_available() else &#039;cpu&#039;)\n\n# Dataset and DataLoader\ntransform = transforms.Compose([transforms.ToTensor()])\n\nfmnist_train_dataset = datasets.FashionMNIST(root=&#039;.\/data&#039;, train=True, transform=transform, download=True)\nfmnist_train_loader = DataLoader(dataset=fmnist_train_dataset, batch_size=batch_size, shuffle=True)\n\nfmnist_test_dataset = datasets.FashionMNIST(root=&#039;.\/data&#039;, train=False, transform=transform, download=True)\nfmnist_test_loader = DataLoader(dataset=fmnist_test_dataset, batch_size=batch_size, shuffle=True)\n<\/code><\/pre>\n<pre><code class=\"language-python\"># Train the model\ntotal_step = len(fmnist_train_loader)\nfor epoch in range(num_epochs):\n    for i, (images, labels) in enumerate(fmnist_train_loader):\n        # each iteration yields one batch of up to batch_size images, flattened to vectors of length 784\n        images = images.to(device).view(images.size(0), -1)\n\n        # Forward pass: the input itself is the reconstruction target\n        outputs = model(images)\n        loss = criterion(outputs, images)\n\n        # Backward and optimize - ALWAYS IN THIS ORDER!\n        optimizer.zero_grad()\n        loss.backward()\n        optimizer.step()\n\n        if (i + 1) % 100 == 0:\n            print(&#039;Epoch [{}\/{}], Step [{}\/{}], Loss: {:.4f}&#039;\n                  .format(epoch + 1, num_epochs, i + 1, total_step, loss.item()))<\/code><\/pre>\n<pre><code>Epoch [1\/5], Step [100\/7500], Loss: 0.3949\nEpoch [1\/5], Step [200\/7500], Loss: 0.3437\nEpoch [1\/5], Step [300\/7500], Loss: 0.3984\n... ...\nEpoch [5\/5], Step [7500\/7500], Loss: 0.2854<\/code><\/pre>\n<pre><code class=\"language-python\">import matplotlib.pyplot as plt\n# let&#039;s see some of the reconstructions\nmodel.eval()  # evaluation mode (affects layers such as dropout and batch norm; it does not disable gradients)\nexamples = enumerate(fmnist_test_loader)\nbatch_idx, (example_data, example_targets) = next(examples)\nprint(&quot;shape: \\n&quot;, example_data.shape)\nfig = plt.figure()\nfor i in range(3):\n    # top row: original image\n    ax = fig.add_subplot(2,3,i+1)\n    ax.imshow(example_data[i][0], cmap=&#039;gray&#039;, interpolation=&#039;none&#039;)\n    ax.set_title(&quot;Ground Truth: {}&quot;.format(example_targets[i]))\n    ax.set_axis_off()\n\n    # bottom row: reconstruction of the same image\n    ax = fig.add_subplot(2,3,i+4)\n    with torch.no_grad():  # no gradients are needed for inference\n        recon_img = model(example_data[i][0].view(1, -1).to(device)).cpu().numpy().reshape(28, 28)\n    ax.imshow(recon_img, cmap=&#039;gray&#039;)\n    ax.set_title(&quot;Reconstruction of: {}&quot;.format(example_targets[i]))\n    ax.set_axis_off()\nplt.tight_layout()<\/code><\/pre>\n<pre><code>shape: \n torch.Size([8, 1, 28, 28])<\/code><\/pre>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914154836129.png\" 
style=\"height:400px\">\n<\/p>\n<p>An autoencoder compresses high-dimensional data (such as images) into a low-dimensional latent space, and the latent representation should capture the important features of the data.<\/p>\n<p>Different dimensionality-reduction methods can then be applied to this latent representation to project it further down to a space that can be visualized. The code below projects the 10-dimensional latent vectors to three dimensions and plots them.<\/p>\n<pre><code class=\"language-python\"># let&#039;s compare different dimensionality reduction methods\nn_neighbors = 10\nn_components = 2  # not used in the 3-D plots below, which pass n_components=3 explicitly\nn_points= 500\n\n# take the first n_points test images in a single batch\nfmnist_test_loader = torch.utils.data.DataLoader(dataset=fmnist_test_dataset,\n                                          batch_size=n_points, \n                                          shuffle=False)\nX, labels = next(iter(fmnist_test_loader))\n# encode the images into their latent vectors\nlatent_X = model.get_latent_rep(X.to(device).view(n_points, -1)).detach().cpu().numpy()\nlabels = labels.cpu().numpy()<\/code><\/pre>\n<pre><code class=\"language-python\">from sklearn.decomposition import PCA, KernelPCA\nfrom sklearn.manifold import LocallyLinearEmbedding, Isomap, TSNE\nimport numpy as np\nimport time\nimport matplotlib.pyplot as plt\nfrom mpl_toolkits.mplot3d import Axes3D\n\nfig = plt.figure(figsize=(15, 8))\n\n# PCA\nt0 = time.time()\nx_pca = PCA(n_components=3).fit_transform(latent_X)\nt1 = time.time()\nprint(&quot;PCA time: %.2g sec&quot; % (t1 - t0))\nax = fig.add_subplot(2, 3, 1, projection=&#039;3d&#039;)\nax.scatter(x_pca[:, 0], x_pca[:, 1], x_pca[:, 2], c=labels, cmap=plt.cm.Spectral)\nax.set_title(&#039;PCA&#039;)\n\n# KPCA\nt0 = time.time()\nx_kpca = KernelPCA(n_components=3, kernel=&#039;rbf&#039;).fit_transform(latent_X)\nt1 = time.time()\nprint(&quot;KPCA time: %.2g sec&quot; % (t1 - t0))\nax = 
fig.add_subplot(2, 3, 2, projection=&#039;3d&#039;)\nax.scatter(x_kpca[:, 0], x_kpca[:, 1], x_kpca[:, 2], c=labels, cmap=plt.cm.Spectral)\nax.set_title(&#039;KernelPCA&#039;)\n\n# LLE\nt0 = time.time()\nx_lle = LocallyLinearEmbedding(n_neighbors=n_neighbors, n_components=3, eigen_solver=&#039;auto&#039;).fit_transform(latent_X)\nt1 = time.time()\nprint(&quot;LLE time: %.2g sec&quot; % (t1 - t0))\nax = fig.add_subplot(2, 3, 3, projection=&#039;3d&#039;)\nax.scatter(x_lle[:, 0], x_lle[:, 1], x_lle[:, 2], c=labels, cmap=plt.cm.Spectral)\nax.set_title(&#039;LLE&#039;)\n\n# Isomap\nt0 = time.time()\nx_isomap = Isomap(n_neighbors=n_neighbors, n_components=3).fit_transform(latent_X)\nt1 = time.time()\nprint(&quot;Isomap time: %.2g sec&quot; % (t1 - t0))\nax = fig.add_subplot(2, 3, 4, projection=&#039;3d&#039;)\nax.scatter(x_isomap[:, 0], x_isomap[:, 1], x_isomap[:, 2], c=labels, cmap=plt.cm.Spectral)\nax.set_title(&#039;Isomap&#039;)\n\n# t-SNE\nt0 = time.time()\nx_tsne = TSNE(n_components=3).fit_transform(latent_X)\nt1 = time.time()\nprint(&quot;t-SNE time: %.2g sec&quot; % (t1 - t0))\nax = fig.add_subplot(2, 3, 5, projection=&#039;3d&#039;)\nscatter = ax.scatter(x_tsne[:, 0], x_tsne[:, 1], x_tsne[:, 2], c=labels, cmap=plt.cm.Spectral)\nax.set_title(&#039;t-SNE&#039;)\n\nbounds = np.linspace(0, 10, 11)\ncb = fig.colorbar(scatter, ax=ax, spacing=&#039;proportional&#039;, ticks=bounds)\ncb.set_label(&#039;Classes Colors&#039;)\n\nplt.tight_layout()\nplt.show()\n<\/code><\/pre>\n<pre><code>PCA time: 0.0012 sec\nKPCA time: 0.011 sec\nLLE time: 0.056 sec\nIsomap time: 0.1 sec\nt-SNE time: 2.7 sec<\/code><\/pre>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914154909157.png\" style=\"height:400px\">\n<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/?size=100&id=18519&format=png&color=000000\" style=\"height:50px;display:inline\"> 
Latent space<\/h3>\n<hr \/>\n<p>In deep learning, an encoder is usually a neural network that learns to map high-dimensional input data (such as images, audio, or text) to a low-dimensional vector in the latent space. This vector captures the important features and structure of the input, and each of its dimensions may correspond to some abstract feature of the data.<\/p>\n<p>The latent space has some useful properties.<\/p>\n<ul>\n<li>First, it has a lower dimension than the input, so it represents the data more efficiently and with less redundancy.<\/li>\n<li>Second, latent vectors support arithmetic such as addition and subtraction, and these operations can correspond to semantic edits of the input, for example adding or removing a particular feature in an image. This makes the latent space an important component of generative and reconstruction models.<\/li>\n<\/ul>\n<p>The latent space plays an important role in many machine learning tasks, including image generation, image reconstruction, feature extraction, and data compression. Learning its structure and features enables higher-level analysis and manipulation of the data.<\/p>\n<p>A good latent space should have the following characteristics:<\/p>\n<ol>\n<li>\n<p><strong>Meaningful representation<\/strong>: each dimension of the latent space should correspond to a meaningful feature of the input. Similar inputs should lie close together in the latent space, and unrelated inputs far apart. For images, for instance, one dimension might encode color and another shape. Such a representation makes operations in the latent space more intuitive and interpretable.<\/p>\n<\/li>\n<li>\n<p><strong>Low dimensionality<\/strong>: the latent space should have a relatively low dimension so that it represents the data efficiently and with little redundancy. Mapping high-dimensional data to a low-dimensional space extracts the most important features and makes computation cheaper.<\/p>\n<\/li>\n<li>\n<p><strong>Continuity<\/strong>: the latent space should be continuous, meaning that neighboring latent vectors correspond to similar data in the input space. Continuity ensures that interpolating between latent vectors, or decoding a new vector, produces reasonable and smooth results. In image generation, for example, linearly interpolating between two different latent vectors yields a new image that lies between the two originals.<\/p>\n<\/li>\n<li>\n<p><strong>Manipulability<\/strong>: latent vectors should be manipulable, meaning that arithmetic on them produces semantic changes in the input data. In image generation, for example, increasing or decreasing a particular dimension of a latent vector can change a feature of the generated image, such as its color or shape.<\/p>\n<\/li>\n<li>\n<p><strong>Consistency<\/strong>: the latent space should be consistent across inputs, meaning that data of the same type has similar latent representations. This supports generalization: similar data gets similar representations, while data from different classes is clearly separated.<\/p>\n<\/li>\n<\/ol>\n<p>A well-designed latent space makes data representations more efficient and meaningful, and supports a range of tasks including generation, reconstruction, interpolation, and semantic manipulation. The latent space is closely tied to image generation models and is one of their key components.<\/p>\n<ul>\n<li>\n<p>Generative models aim to produce realistic images from the latent space. They are typically built with methods such as generative adversarial networks, diffusion models, or variational autoencoders.<\/p>\n<\/li>\n<li>\n<p>Taking image generation as an example, an important step in these models is mapping the input to a vector in the latent space, which is usually done by an encoder.<\/p>\n<\/li>\n<li>\n<p>The encoder converts an input image into a latent vector in which each dimension corresponds to some feature of the image. This vector can be regarded as the latent representation, or feature vector, of the image. It can be used for various image operations, such as generating new images, reconstructing the original image, or interpolating in the latent space.<\/p>\n<\/li>\n<li>\n<p>The other part of a generative model is the decoder, whose task is to map latent vectors back to image space. The decoder receives a latent vector and decodes it into a realistic image. This process can be viewed as the inverse mapping of the encoding.<\/p>\n<\/li>\n<li>\n<p>Training a generative model yields an optimized latent representation whose vectors can be decoded into high-quality images. Images can then be manipulated in the latent space, for example by adjusting the value of a particular dimension to change a feature of the image, or by interpolating between two latent vectors to generate a new image that lies between them.<\/p>\n<\/li>\n<\/ul>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/?size=100&id=32601&format=png&color=000000\" style=\"height:50px;display:inline\"> Maximum likelihood estimation (MLE)<\/h3>\n<hr \/>\n<p>At their core, generative models rest on maximum likelihood estimation. Given a finite set of samples, we want to estimate the underlying distribution of the population, that is, the true data distribution $p(x)$. To do so we assume that the samples $x$ follow some parametric family; for continuous data a common choice is the Gaussian $\\mathcal{N}\\left(\\mu, \\sigma^2\\right)$. Here $p(x)$ denotes the true data distribution that we want the model to approximate and sample from.<\/p>\n<p>Once a form for $p(x)$ has been assumed, estimating the distribution reduces to estimating its parameters. Assuming the samples $X=\\{x_1, \\cdots, x_n\\}$ are independent and identically distributed, their joint probability under parameters $\\theta$ is:<br \/>\n$$<br \/>\nL(\\theta ; X)=P(X ; \\theta)=\\prod_{i=1}^n p\\left(x_i ; \\theta\\right)<br \/>\n$$<\/p>\n<p>Here $\\theta$ can be viewed as the trainable parameters of the model, and the expression above is called the likelihood function. Maximizing the likelihood is equivalent to minimizing the negative log-likelihood; taking the logarithm turns the product into a sum, which is easier to compute:<br \/>\n$$<br \/>\n\\hat{\\theta}=\\arg \\max _\\theta L(\\theta ; X)=\\arg \\min _\\theta-\\sum_{i=1}^n \\log p\\left(x_i ; \\theta\\right)<br \/>\n$$<br \/>\nOptimization requires the gradient of this objective with respect to $\\theta$. With a slight abuse of notation, from here on $L(\\theta ; X)$ denotes the negative log-likelihood being minimized:<br \/>\n$$<br \/>\n\\nabla_\\theta L(\\theta ; X)=-\\nabla_\\theta \\sum_{i=1}^n \\log p\\left(x_i ; \\theta\\right)=-\\sum_{i=1}^n \\nabla_\\theta \\log p\\left(x_i ; \\theta\\right)<br \/>\n$$<\/p>\n<p>Using gradient descent, maximum likelihood estimation yields the parameters $\\theta$ and hence the probability distribution, from which images can finally be sampled.<\/p>\n<p>Maximum likelihood estimation has one key problem: <strong>it rests on an assumption, namely that $p(x)$ follows a particular distribution<\/strong>. If the assumed distribution does not match the true data distribution, the quality of the MLE result may suffer.<\/p>\n<p>More importantly, <strong>choosing the distribution requires domain knowledge or a prior<\/strong>. We need to understand the generating process well; if the chosen distribution differs from the true one, the results can be poor. Real-world problems are often very complex: it is usually hard to fully understand the generating process, and hard to find a distributional form that describes it accurately. The assumed distribution may therefore deviate from the true one, or even be entirely wrong.<\/p>\n<p>If the distribution really cannot be specified, we can instead fit it with a flexible parametric model. Depending on the application and the characteristics of the data, common choices of model for describing $P(x ; \\theta)$ include:<\/p>\n<ul>\n<li>Linear models<\/li>\n<li>Neural networks<\/li>\n<li>Mixture models<\/li>\n<li>and so on...<\/li>\n<\/ul>\n<p>Even with such a model, the results are often still unsatisfactory without a good inductive bias. This motivates latent variable models:<\/p>\n<h3><img decoding=\"async\" 
src=\"https:\/\/img.icons8.com\/?size=100&id=bxK5SpQHsVmH&format=png&color=000000\" style=\"height:50px;display:inline\"> Latent variable models (hidden variable models)<\/h3>\n<hr \/>\n<p>A latent variable can serve as a stepping stone for the hard problem of image generation. The idea is common in mathematics: if obtaining the result \u201cb\u201d directly from the variable \u201ca\u201d is very hard, but going from \u201ca\u201d to \u201cc\u201d and from \u201cc\u201d to \u201cb\u201d are both easy, we can bypass the hardest part and solve along the route \u201ca \u2192 c \u2192 b\u201d.<\/p>\n<p>Take image generation: if estimating the data distribution $p(x)$ directly from image samples is hard, we can go through a latent variable instead. Consider handwritten digits: a person writing a digit first decides which digit to write and pictures what it should look like, and only then puts it on paper as an image.<\/p>\n<p>This process can be summarized in two stages:<\/p>\n<ol>\n<li>First decide on the digit and the other factors that influence its appearance, represented by a latent variable $z$;<\/li>\n<li>Then generate the digit image from the latent variable $z$. This is the latent variable model, described mathematically as:<br \/>\n$$<br \/>\nP(X)=\\int P(X \\mid z ; \\theta) P(z) \\mathrm{d} z<br 
\/>\n$$<\/li>\n<\/ol>\n<p>We usually assume that $z$ follows a standard normal distribution, $z \\sim \\mathcal{N}(0, I)$. The conditional $P(X \\mid z ; \\theta)$ can be written as $f(z ; \\theta)$: a function with parameters $\\theta$ that computes the distribution of the sample $X$ given $z$. The conditional form is used because it makes explicit that $X$ is generated from $z$.<br \/>\n$$<br \/>\nP(X)=\\int f(z ; \\theta) P(z) \\mathrm{d} z<br \/>\n$$<\/p>\n<p>The key idea behind latent variable models is that <strong>a simple probability distribution, passed through a sufficiently complex function, can in principle be mapped to almost any other probability distribution<\/strong>.<\/p>\n<p>In our example, $z$ follows a standard Gaussian; after sampling, the transformation $f(z ; \\theta)$ can turn it into the true distribution $P(X)$ of handwritten digits.<\/p>\n<p>With a latent variable model, <strong>the problem with maximum likelihood estimation has been sidestepped<\/strong>: we no longer need to specify a complicated distributional form by hand, nor worry about it mismatching the true distribution. We only need to find the function $f(z ; \\theta)$ (that is, to fit the model).<\/p>\n<p>This function looks hard to find, and it is. Fortunately we have neural networks: <strong>one of the most appealing aspects of deep learning is its fitting capacity; problems that are hard to solve directly can often be handed to a neural network to fit.<\/strong><\/p>\n<p>The parameters can then be found with an optimization algorithm such as gradient descent or Newton's method.<\/p>\n<p>The gradient of the negative log-likelihood with respect to $\\theta$ is:<br \/>\n$$<br \/>\n\\nabla_\\theta L(\\theta ; X)=-\\sum_{i=1}^n \\frac{\\int \\nabla_\\theta p\\left(x_i \\mid z ; \\theta\\right) p(z) \\mathrm{d} z}{\\int p\\left(x_i \\mid z ; \\theta\\right) p(z) \\mathrm{d} z}<br \/>\n$$<br \/>\nThis gradient can be used to optimize the parameters and obtain the final latent variable model.<\/p>\n<p>However, latent variable models have a problem: computing $\\nabla_\\theta L(\\theta ; X)$ requires the integrals in both the numerator and the denominator. To let $z$ carry more information, it is usually assumed to be a continuous random variable, and in that case <strong>the integrals generally have no closed-form solution and are hard to compute exactly<\/strong>.<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/?size=100&id=PJel6Hfekiea&format=png&color=000000\" style=\"height:50px;display:inline\"> Monte Carlo sampling<\/h3>\n<hr 
\/>\n<p><strong>\u8499\u7279\u5361\u6d1b\u91c7\u6837\u7684\u57fa\u672c\u6b65\u9aa4\u5982\u4e0b<\/strong><\/p>\n<p>\u9690\u53d8\u91cf\u6a21\u578b\u5728\u8ba1\u7b97\u68af\u5ea6\u65f6\u5b58\u5728\u79ef\u5206\u96be\u8ba1\u7b97\u7684\u95ee\u9898\u3002\u9488\u5bf9\u6c42\u79ef\u5206\u95ee\u9898\uff0c\u5f88\u96be\u8ba1\u7b97\u51c6\u786e\u503c\uff0c\u56e0\u6b64\u901a\u5e38\u91c7\u7528\u8499\u7279\u5361\u6d1b\u91c7\u6837\u53bb\u8fd1\u4f3c\u6c42\u89e3\u3002<\/p>\n<p>\u539f\u6765\u7684\u79ef\u5206\u53ef\u4ee5\u5199\u6210\u671f\u671b\u7684\u5f62\u5f0f $\\int p(x \\mid z ; \\theta) p(z) \\mathrm{d} z=\\mathbb{E}_{z \\sim p(z)}[p(x \\mid z ; \\theta)]$, \u7136\u540e\u5229\u7528\u671f\u671b\u6cd5\u6c42\u79ef\u5206, \u6b65\u9aa4\u5982\u4e0b\u3002<\/p>\n<p>(1) \u4ece $p(z)$ \u4e2d\u591a\u6b21\u91c7\u6837 $z_1, z_2, \\cdots, z_m$ \uff1b<\/p>\n<p>(2) \u5bf9\u6bcf\u4e2a $z_j$ \u8ba1\u7b97 $p\\left(x \\mid z_j ; \\theta\\right)$ \uff1b<\/p>\n<p>(3) \u6c42\u8fd9\u4e9b\u6982\u7387\u503c\u7684\u5747\u503c\u3002\u7528\u6570\u5b66\u8868\u8fbe\u4e3a:<\/p>\n<p>$$<br \/>\n\\int p(x \\mid z ; \\theta) p(z) \\mathrm{d} z=\\mathbb{E}_{z \\sim p(z)}[p(x \\mid z ; \\theta)] \\approx \\frac{1}{m} \\sum_{j=1}^m p\\left(x \\mid z_j ; \\theta\\right)<br \/>\n$$<\/p>\n<p>\u901a\u8fc7\u5bf9 $z$ \u591a\u6b21\u91c7\u6837\uff0c\u53ef\u4ee5\u8ba1\u7b97 $\\nabla_\\theta L(\\theta ; X)$ \u7684\u8fd1\u4f3c\u503c\u3002<\/p>\n<p><strong>\u7b80\u5355\u6765\u8bf4, 
\u8499\u7279\u5361\u6d1b\u91c7\u6837\u5c31\u662f\u901a\u8fc7\u6837\u672c\u7684\u5747\u503c\u6765\u8fd1\u4f3c\u603b\u4f53\u7684\u79ef\u5206\u3002<\/strong><\/p>\n<p>\u8499\u7279\u5361\u6d1b\u91c7\u6837\u5b58\u5728\u7684\u95ee\u9898\u662f\uff1a\u91c7\u6837\u6b21\u6570\u4e00\u822c\u9700\u8981\u5f88\u5927\u3002\u4e3b\u8981\u662f\u7531\u4e24\u4e2a\u56e0\u7d20\u5f15\u8d77\u7684\uff1a<\/p>\n<p>\uff081\uff09<strong>\u9ad8\u7ef4\u5ea6<\/strong>\uff1a\u5bf9\u4e8e\u9ad8\u7ef4\u95ee\u9898\uff0c\u7531\u4e8e\u201c\u7ef4\u5ea6\u7684\u8bc5\u5492\u201d\uff0c\u53ef\u80fd\u9700\u8981\u6307\u6570\u7ea7\u7684\u91c7\u6837\u6570\u624d\u80fd\u5728\u6240\u6709\u7ef4\u5ea6\u4e0a\u83b7\u5f97\u8db3\u591f\u7684\u8986\u76d6\u3002\u8fd9\u662f\u56e0\u4e3a\u5728\u9ad8\u7ef4\u7a7a\u95f4\u4e2d\uff0c\u5927\u90e8\u5206\u7684\u4f53\u79ef\u90fd\u5728\u9760\u8fd1\u8fb9\u754c\u7684\u533a\u57df\uff0c\u6240\u4ee5\u9700\u8981\u5927\u91cf\u7684\u6837\u672c\u624d\u80fd\u7cbe\u786e\u5730\u4f30\u8ba1\u6574\u4e2a\u7a7a\u95f4\u7684\u6027\u8d28\u3002<\/p>\n<p>\uff082\uff09<strong>\u7a00\u758f\u533a\u57df<\/strong>\uff1a\u5982\u679c\u611f\u5174\u8da3\u7684\u5206\u5e03\u5728\u67d0\u4e9b\u533a\u57df\u4e2d\u975e\u5e38\u7a00\u758f\uff0c\u90a3\u4e48\u5927\u591a\u6570\u7684\u8499\u7279\u5361\u6d1b\u6837\u672c\u53ef\u80fd\u90fd\u4f1a\u843d\u5165\u4e0d\u5173\u5fc3\u7684\u533a\u57df\uff0c\u800c\u771f\u6b63\u5173\u5fc3\u7684\u533a\u57df\u53ef\u80fd\u4f1a\u88ab\u4e25\u91cd\u5730\u6b20\u91c7\u6837\u3002\u8fd9\u4f1a\u5bfc\u81f4\u4f30\u8ba1\u7ed3\u679c\u4e25\u91cd\u504f\u79bb\u771f\u5b9e\u503c\u3002<\/p>\n<p>\u89e3\u51b3\u8fd9\u4e24\u4e2a\u95ee\u9898\u7684\u65b9\u6cd5\u662f\u7f29\u5c0f $z$ \u7684\u53d6\u503c\u7a7a\u95f4\u3002\u4f8b\u5982\u7f29\u5c0f $p(z)$ \u7684\u65b9\u5dee $\\sigma^2$, \u90a3\u4e48 $z$ \u7684\u91c7\u6837\u8303\u56f4\u4f1a\u7f29\u5c0f, \u91c7\u6837\u7684\u6b21\u6570 $m$ \u4e5f\u4e0d\u9700\u8981\u90a3\u4e48\u5927\u3002\u540c\u65f6, \u4e5f\u53ef\u80fd\u628a\u751f\u6210\u574f\u6837\u672c\u7684 $z$ \u6392\u9664, 
\u751f\u6210\u66f4\u50cf\u771f\u5b9e\u6837\u672c\u7684\u56fe\u50cf\u3002\u8fd9\u5176\u5b9e\u5c31\u5f15\u51fa\u4e86\u4e0b\u9762\u8981\u4ecb\u7ecd\u7684\u53d8\u5206\u81ea\u7f16\u7801\u5668 VAE \u7684\u539f\u7406\u3002<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/?size=100&id=KKRe9LOjJRZ8&format=png&color=000000\" style=\"height:50px;display:inline\"> \u53d8\u5206\u81ea\u7f16\u7801\u5668\uff08Variational Autoencoder\uff09<\/h3>\n<hr \/>\n<p><strong>\u7ee7\u7eed\u4e0a\u4e00\u8282\u7684\u95ee\u9898, \u600e\u4e48\u80fd\u7f29\u5c0f $z$ \u7684\u53d6\u503c\u7a7a\u95f4\u5462?<\/strong><\/p>\n<ul>\n<li>\u539f\u6765 $z$ \u4ece\u5148\u9a8c\u6982\u7387\u5206\u5e03 $p(z)$ \u4e2d\u91c7\u6837,\u73b0\u5728\u53ef\u4ee5\u8003\u8651\u4ece $z$ \u7684\u540e\u9a8c\u6982\u7387\u5206\u5e03 $p(z \\mid X)$ \u4e2d\u91c7\u6837\u3002<\/li>\n<li>\u5177\u4f53\u6765\u8bf4, \u7ed9\u5b9a\u4e00\u4e2a\u771f\u5b9e\u6837\u672c $X$, \u5047\u8bbe\u5b58\u5728\u4e00\u4e2a\u4e13\u5c5e\u4e8e $X$ \u7684\u5206\u5e03 $p(z \\mid X)$ (\u540e\u9a8c\u5206\u5e03)\uff0c\u5e76\u8fdb\u4e00\u6b65\u5047\u8bbe\u8fd9\u4e2a\u5206\u5e03\u662f\u72ec\u7acb\u7684\u3001\u591a\u5143\u7684\u6b63\u6001\u5206\u5e03\u3002<\/li>\n<\/ul>\n<p><strong>\u5982\u4f55\u7406\u89e3\u540e\u9a8c\u6982\u7387 $p(z \\mid X)$ \u4f1a\u6bd4\u5148\u9a8c\u6982\u7387 $p(z)$ \u66f4\u597d\u5462?<\/strong> 
<\/p>\n<ul>\n<li>\n<p>\u8fd9\u662f\u56e0\u4e3a\u540e\u9a8c\u6982\u7387\u5305\u542b\u4e86\u66f4\u591a\u7684\u4fe1\u606f\u3002\u5148\u9a8c\u6982\u7387\u53ea\u53cd\u6620\u4e86\u5728\u6ca1\u6709\u89c2\u5bdf\u6570\u636e\u4e4b\u524d\u5bf9\u9690\u53d8\u91cf\u7684\u77e5\u8bc6\u6216\u8005\u5047\u8bbe\uff0c\u5b83\u901a\u5e38\u88ab\u8bbe\u7f6e\u4e3a\u4e00\u79cd\u7b80\u5355\u7684\u5206\u5e03\uff0c\u5982\u9ad8\u65af\u5206\u5e03\u6216\u8005\u5747\u5300\u5206\u5e03\u3002\u800c\u540e\u9a8c\u6982\u7387\u5219\u662f\u5728\u89c2\u5bdf\u5230\u6570\u636e\u4e4b\u540e\uff0c\u5bf9\u9690\u53d8\u91cf\u7684\u6700\u65b0\u8ba4\u8bc6\uff0c\u5b83\u5305\u542b\u4e86\u6570\u636e\u7684\u4fe1\u606f\u3002<\/p>\n<\/li>\n<li>\n<p>\u4f8b\u5982\uff0c\u5047\u8bbe\u6211\u4eec\u7684\u4efb\u52a1\u662f\u5bf9\u4eba\u8138\u56fe\u7247\u8fdb\u884c\u5efa\u6a21\uff0c\u9690\u53d8\u91cf\u53ef\u80fd\u4ee3\u8868\u4e00\u4e9b\u4eba\u8138\u7684\u7279\u6027\uff0c\u5982\u6027\u522b\u3001\u5e74\u9f84\u7b49\u3002\u5728\u6ca1\u6709\u770b\u5230\u4efb\u4f55\u56fe\u7247\u7684\u60c5\u51b5\u4e0b\uff0c\u53ef\u80fd\u5047\u8bbe\u6240\u6709\u7684\u6027\u522b\u548c\u5e74\u9f84\u90fd\u662f\u7b49\u53ef\u80fd\u7684\uff0c\u8fd9\u5c31\u662f\u5148\u9a8c\u6982\u7387\u3002\u4f46\u662f\u5f53\u770b\u5230\u4e00\u4e9b\u56fe\u7247\u4e4b\u540e\uff0c\u53ef\u80fd\u4f1a\u53d1\u73b0\u5b9e\u9645\u4e0a\u67d0\u4e9b\u6027\u522b\u6216\u8005\u5e74\u9f84\u7684\u4eba\u8138\u56fe\u7247\u66f4\u5e38\u89c1\uff0c\u8fd9\u5c31\u662f\u540e\u9a8c\u6982\u7387\u3002<\/p>\n<\/li>\n<li>\n<p>\u56e0\u6b64\uff0c\u5f53\u6211\u4eec\u8bf4\u540e\u9a8c\u6982\u7387\u6bd4\u5148\u9a8c\u6982\u7387\u66f4\u597d\uff0c\u5176\u5b9e\u662f\u8bf4\uff0c\u540e\u9a8c\u6982\u7387\u5305\u542b\u4e86\u66f4\u591a\u7684\u6765\u81ea\u4e8e\u6570\u636e\u7684\u4fe1\u606f\uff0c\u80fd\u591f\u66f4\u51c6\u786e\u5730\u53cd\u6620\u771f\u5b9e\u4e16\u754c\u7684\u60c5\u51b5\u3002\u5728\u53d8\u5206\u81ea\u7f16\u7801\u5668\u4e2d\uff0c\u7f16\u7801\u5668\u7684\u76ee\u6807\u5c31\u662f\u5b66\u4e60\u8868\u793a\u540e\u9a8c\u69
82\u7387\u5206\u5e03\u3002<\/p>\n<\/li>\n<\/ul>\n<p><strong>\u4ece\u6570\u5b66\u89d2\u5ea6\u51fa\u53d1, \u5982\u4f55\u6c42\u51fa\u540e\u9a8c\u6982\u7387\u5206\u5e03 $p(z \\mid X)$ \u5462?<\/strong><\/p>\n<ul>\n<li>\n<p>\u6c42\u9690\u53d8\u91cf\u7684\u540e\u9a8c\u5206\u5e03\u662f\u53d8\u5206\u63a8\u65ad\u7684\u4e00\u4e2a\u6838\u5fc3\u95ee\u9898\u3002<\/p>\n<\/li>\n<li>\n<p>\u4e00\u822c\u662f\u65e0\u6cd5\u51c6\u786e\u6c42\u51fa\u540e\u9a8c\u5206\u5e03\u7684, \u4f46\u662f\u53d8\u5206\u63a8\u65ad\u53ef\u4ee5\u7528\u53e6\u4e00\u4e2a\u5206\u5e03 $q_\\theta(z \\mid X)$ \u8fd1\u4f3c\u4f30\u8ba1 $p(z \\mid X)$, \u7136\u540e\u4ece $q_\\theta(z \\mid X)$ \u4e2d\u91c7\u6837\u6765\u8fd1\u4f3c\u4ece $p(z \\mid X)$ \u4e2d\u91c7\u6837\u3002<\/p>\n<\/li>\n<li>\n<p>\u8fd9\u79cd\u65b9\u6cd5\u662f\u7528\u4e00\u4e2a\u51fd\u6570\u8fd1\u4f3c\u53e6\u4e00\u4e2a\u51fd\u6570\uff0c\u5176\u5b9e\u5c31\u662f\u7528\u795e\u7ecf\u7f51\u7edc\u6765\u8fd1\u4f3c\u6982\u7387\u5206\u5e03\u53c2\u6570\u3002<\/p>\n<\/li>\n<\/ul>\n<p>\u5728\u53d8\u5206\u63a8\u65ad\u7684\u80cc\u666f\u4e0b, \u6211\u4eec\u6709\u4e00\u4e2a\u590d\u6742\u7684\u6982\u7387\u5206\u5e03, \u901a\u5e38\u662f\u540e\u9a8c\u6982\u7387\u5206\u5e03 $p(z \\mid X)$, \u8fd9\u91cc $z$ \u662f\u9690\u53d8\u91cf, $X$ \u662f\u89c2\u5bdf\u5230\u7684\u6570\u636e\u3002\u76ee\u6807\u662f\u627e\u5230\u4e00\u4e2a\u76f8\u5bf9\u7b80\u5355\u7684\u5206\u5e03 (\u6bd4\u5982\u9ad8\u65af\u5206\u5e03), \u79f0\u5176\u4e3a $q_\\theta(z \\mid X)$, \u7528\u5b83\u6765\u8fd1\u4f3c\u771f\u5b9e\u7684\u540e\u9a8c\u5206\u5e03\u3002\u5176\u4e2d\uff0c $\\theta$ \u8868\u793a\u5206\u5e03\u7684\u53c2\u6570, \u9700\u8981\u627e\u5230\u5408\u9002\u7684 $\\theta$ \u6765\u6700\u5927\u5316\u8fd9\u79cd\u8fd1\u4f3c\u7684\u51c6\u786e\u6027\u3002<\/p>\n<p><strong>\u90a3\u4e48, \u5982\u4f55\u5ea6\u91cf\u8fd9\u79cd \u201c\u8fd1\u4f3c\u201d\u7684\u51c6\u786e\u6027\u5462?<\/strong><\/p>\n<ul>\n<li>\u8fd9\u5c31\u9700\u8981\u7528\u5230 KL \u6563\u5ea6, 
\u5b83\u662f\u4e00\u79cd\u8861\u91cf\u4e24\u4e2a\u6982\u7387\u5206\u5e03\u4e4b\u95f4 \u201c\u8ddd\u79bb\u201d \u7684\u65b9\u6cd5\u3002<\/li>\n<li>\u76ee\u6807\u5c31\u662f\u627e\u5230\u53c2\u6570 $\\theta$, \u4f7f\u5f97 $q_\\theta(z \\mid X)$ \u548c $p(z \\mid X)$ \u4e4b\u95f4\u7684 KL \u6563\u5ea6\u6700\u5c0f\u3002<\/li>\n<li>\u7136\u540e, \u8fd9\u4e2a\u6700\u5c0f\u5316\u95ee\u9898\u53ef\u4ee5\u901a\u8fc7\u68af\u5ea6\u4e0b\u964d\u7b49\u4f18\u5316\u7b97\u6cd5\u6765\u6c42\u89e3\u3002<\/li>\n<\/ul>\n<p><strong>\u53d8\u5206\u63a8\u65ad\u5c31\u662f\u4e00\u79cd\u7528\u4f18\u5316\u7684\u65b9\u5f0f\u6765\u903c\u8fd1\u590d\u6742\u7684\u6982\u7387\u5206\u5e03\u7684\u65b9\u6cd5<\/strong>\u3002\u901a\u8fc7\u5728\u53c2\u6570\u7a7a\u95f4\u4e2d\u8fdb\u884c\u68af\u5ea6\u4e0b\u964d\uff0c\u627e\u5230\u4e00\u4e2a\u53ef\u4ee5\u7528\u6765\u8fd1\u4f3c\u771f\u5b9e\u5206\u5e03\u7684\u7b80\u5355\u5206\u5e03\u3002\u4f46\u662f\uff0c\u8fd8\u6709\u4e00\u4e2a\u5f88\u5927\u7684\u95ee\u9898\uff1a<strong>$p(z \\mid X)$ \u662f\u672a\u77e5\u7684<\/strong>, \u6240\u4ee5\u65e0\u6cd5\u76f4\u63a5\u8ba1\u7b97 $\\mathrm{KL}$ \u6563\u5ea6\u3002<\/p>\n<p>\u89e3\u51b3\u8fd9\u4e2a\u95ee\u9898\u7684\u65b9\u6cd5\u662f\u5f15\u5165\u4e00\u4e2a\u53eb\u505a<strong>\u8bc1\u636e\u4e0b\u754c(Evidence Lower Bound, ELBO)<\/strong> \u7684\u91cf\u3002ELBO \u662f\u6a21\u578b\u5bf9\u6570\u4f3c\u7136\u7684\u4e00\u4e2a\u4e0b\u754c, \u5b83\u4e0e KL \u6563\u5ea6\u7684\u548c\u662f\u4e00\u4e2a\u5e38\u6570, \u8fd9\u4e2a\u5e38\u6570\u5c31\u662f\u89c2\u6d4b\u6570\u636e\u7684\u5bf9\u6570\u4f3c\u7136, \u5373 $\\log p(X)=\\mathrm{ELBO}+\\mathrm{KL}\\left(q_\\theta(z \\mid X) \\| p(z \\mid X)\\right)$\u3002\u56e0\u6b64, \u6700\u5927\u5316 ELBO \u7b49\u4ef7\u4e8e\u6700\u5c0f\u5316 KL \u6563\u5ea6\uff0c\u5176\u89e3\u91ca\u8be6\u89c1\u6982\u7387\u7edf\u8ba1\u6559\u7a0b\u3002<\/p>\n<p>ELBO \u53ef\u4ee5\u5199\u6210\u4ee5\u4e0b\u5f62\u5f0f:<br \/>\n$$<br \/>\n\\mathrm{ELBO}=E_{q_\\theta(z \\mid X)}[\\log p(X \\mid z)]-\\mathrm{KL}\\left(q_\\theta(z \\mid X) \\| p(z)\\right)<br \/>\n$$<\/p>\n<ul>\n<li>\n<p>\u7b2c\u4e00\u9879 $E_{q_\\theta(z \\mid 
X)}[\\log p(X \\mid z)]$ \u662f\u91cd\u6784\u9879, \u4ee3\u8868\u4e86\u751f\u6210\u7684\u6570\u636e\u4e0e\u5b9e\u9645\u6570\u636e\u7684\u76f8\u4f3c\u7a0b\u5ea6\u3002\u5177\u4f53\u6765\u8bf4, $z \\sim q_\\theta(z \\mid X)$ \u8868\u793a\u7ed9\u5b9a\u4e00\u4e2a $X$, \u53ef\u4ee5\u6839\u636e\u5206\u5e03 $q_\\theta(z \\mid X)$ \u91c7\u6837\u4e00\u4e2a $z$, \u8fd9\u4e2a\u8fc7\u7a0b\u53ef\u4ee5\u7406\u89e3\u4e3a\u628a $X$ \u7f16\u7801\u6210 $z$, \u6b64\u8fc7\u7a0b\u88ab\u79f0\u4e3a VAE \u6a21\u578b\u7684\u7f16\u7801\u5668\u3002$p(X \\mid z)$ \u8868\u793a\u6839\u636e $z$ \u751f\u6210\u8f93\u51fa\u7ed3\u679c, \u6b64\u8fc7\u7a0b\u88ab\u79f0\u4e3a VAE \u6a21\u578b\u7684\u89e3\u7801\u5668\u3002\u6574\u4f53\u8868\u793a\u7ed9\u5b9a $X$ \u7f16\u7801\u6210 $z$ \u518d\u91cd\u6784\u5f97\u5230\u8f93\u51fa\u7ed3\u679c\u8fd9\u4e2a\u8fc7\u7a0b\u7684\u671f\u671b\u5bf9\u6570\u4f3c\u7136, \u5b83\u7684\u76f8\u53cd\u6570\u88ab\u79f0\u4e3a\u91cd\u6784\u8bef\u5dee\u3002\u5982\u679c\u8fd9\u4e2a\u671f\u671b\u5f88\u5927, \u8868\u660e\u5f97\u5230\u7684 $z$ \u662f $X$ \u7684\u4e00\u4e2a\u597d\u7684\u8868\u793a, \u80fd\u591f\u62bd\u53d6 $X$ \u8db3\u591f\u591a\u7684\u4fe1\u606f\u6765\u91cd\u6784\u8f93\u51fa\u7ed3\u679c, \u8ba9\u5b83\u4e0e $X$ \u5c3d\u53ef\u80fd\u76f8\u4f3c\u3002<\/p>\n<\/li>\n<li>\n<p>\u7b2c\u4e8c\u9879 $\\mathrm{KL}\\left(q_\\theta(z \\mid X) \\| p(z)\\right)$ \u662f $q_\\theta(z \\mid X)$ \u548c\u5148\u9a8c\u5206\u5e03 $p(z)$ \u7684 KL \u6563\u5ea6, \u4ee3\u8868\u4e86\u9690\u53d8\u91cf\u7684\u5206\u5e03\u504f\u79bb\u5148\u9a8c\u5206\u5e03\u7684\u7a0b\u5ea6\u3002<\/p>\n<\/li>\n<li>\n<p>\u53ef\u4ee5\u770b\u5230, \u8ba1\u7b97 ELBO \u5e76\u4e0d\u9700\u8981\u77e5\u9053\u540e\u9a8c\u5206\u5e03 $p(z \\mid X)$ \u7684\u5177\u4f53\u5f62\u5f0f, \u53ea\u9700\u8981\u77e5\u9053\u6570\u636e\u7684\u751f\u6210\u6a21\u578b $p(X \\mid z)$ \u548c\u9690\u53d8\u91cf\u7684\u5148\u9a8c\u5206\u5e03 $p(z)$, \u800c\u8fd9\u4e24\u8005\u901a\u5e38\u90fd\u662f\u53ef\u4ee5\u8bbe\u5b9a\u7684, \u6240\u4ee5\u53ef\u4ee5\u8ba1\u7b97\u3002\u56e0\u6b64, 
\u53ef\u4ee5\u901a\u8fc7\u6700\u5927\u5316 ELBO \u6765\u5b9e\u73b0\u53d8\u5206\u63a8\u65ad\u3002<\/p>\n<\/li>\n<\/ul>\n<p>\u5b9e\u9645\u4e0a\uff0c<strong>\u8d1fELBO\u5c31\u662fVAE\u7684\u635f\u5931\u51fd\u6570<\/strong>\uff0cVAE\u7684\u8bad\u7ec3\u76ee\u6807\u5c31\u662f\u6700\u5927\u5316ELBO\uff08\u8bc1\u636e\u4e0b\u754c\uff09\uff0c\u5373\u6700\u5c0f\u5316\u8d1fELBO\u3002\u8d1fELBO\u7531\u4e24\u90e8\u5206\u7ec4\u6210\uff1a<\/p>\n<ul>\n<li>\u7b2c\u4e00\u90e8\u5206\u662f\u671f\u671b\u7684\u91cd\u6784\u8bef\u5dee\uff0c\u5b83\u8861\u91cf\u7684\u662f\u6a21\u578b\u751f\u6210\u7684\u6570\u636e\u4e0e\u771f\u5b9e\u6570\u636e\u7684\u5339\u914d\u7a0b\u5ea6\uff1b<\/li>\n<li>\u7b2c\u4e8c\u90e8\u5206\u662fKL\u6563\u5ea6\uff0c\u5b83\u8861\u91cf\u7684\u662f\u9690\u53d8\u91cf\u7684\u5206\u5e03\u504f\u79bb\u5148\u9a8c\u5206\u5e03\u7684\u7a0b\u5ea6\u3002<\/li>\n<\/ul>\n<p>\u91cd\u6784\u8bef\u5dee\u53ef\u4ee5\u7528\u5404\u79cd\u4e0d\u540c\u7684\u65b9\u5f0f\u6765\u8ba1\u7b97\uff0c\u4f8b\u5982\u4f7f\u7528\u5747\u65b9\u8bef\u5dee\u6216\u8005\u4ea4\u53c9\u71b5\u635f\u5931\u3002\u81f3\u4e8eKL\u6563\u5ea6\uff0c\u7531\u4e8e\u901a\u5e38\u5047\u8bbe\u5148\u9a8c\u5206\u5e03\u662f\u6807\u51c6\u6b63\u6001\u5206\u5e03\uff0c\u6240\u4ee5KL\u6563\u5ea6\u53ef\u4ee5\u7528\u9690\u53d8\u91cf\u7684\u5747\u503c\u548c\u65b9\u5dee\u6765\u663e\u5f0f\u8ba1\u7b97\u3002<\/p>\n<p>VAE\u7684\u635f\u5931\u51fd\u6570\u516c\u5f0f\u8868\u793a\u5982\u4e0b\uff1a<\/p>\n<p>$$<br \/>\n\\operatorname{loss}_{\\mathrm{VAE}}(\\phi, \\theta)=\\sum_{i=1}^n\\left(-E_{\\boldsymbol{z}_i \\sim q_\\phi\\left(\\boldsymbol{z}_i \\mid \\boldsymbol{x}_i\\right)}\\left[\\log p_\\theta\\left(\\boldsymbol{x}_i \\mid \\boldsymbol{z}_i\\right)\\right]+\\mathrm{KL}\\left(q_\\phi\\left(\\boldsymbol{z}_i \\mid \\boldsymbol{x}_i\\right) \\| p\\left(\\boldsymbol{z}_i\\right)\\right)\\right)<br \/>\n$$<\/p>\n<p>\u5176\u4e2d $\\phi$ \u4e3a\u7f16\u7801\u5668\u7684\u53c2\u6570\uff08\u5bf9\u5e94\u524d\u6587 $q_\\theta(z \\mid X)$ \u4e2d\u7684\u53c2\u6570\uff09\uff0c$\\theta$ \u4e3a\u89e3\u7801\u5668\u7684\u53c2\u6570\u3002<\/p>\n<p><strong>Tip\uff1a\u5927\u5bb6\u6709\u65f6\u5019\u53ef\u80fd\u4f1a\u770b\u5230\u6709\u4eba\u5c06VAE\u7684\u635f\u5931\u51fd\u6570\u4e2d\u7684KL\u6563\u5ea6\u9879\u8868\u793a\u4e3a\uff1a<\/strong><br 
\/>\n$$<br \/>\n\\frac{1}{2} \\sum_{i=1}^l\\left(\\exp \\left(\\sigma_i\\right)-\\left(1+\\sigma_i\\right)+\\left(\\mu_i\\right)^2\\right)<br \/>\n$$<\/p>\n<p>\u8fd9\u662f\u53ef\u4ee5\u8fdb\u884c\u63a8\u5bfc\u7684\u3002\u5728\u53d8\u5206\u81ea\u7f16\u7801\u5668\u4e2d, \u901a\u5e38\u4f1a\u5047\u8bbe\u9690\u53d8\u91cf $z$ \u7684\u5148\u9a8c\u5206\u5e03\u662f\u6807\u51c6\u6b63\u6001\u5206\u5e03, \u5373 $p(z)=\\mathcal{N}(0, I)$ \u3002\u8fd8\u4f1a\u5047\u8bbe\u7f16\u7801\u5668\u7ed9\u51fa\u7684\u540e\u9a8c\u5206\u5e03\u4e5f\u662f\u4e00\u4e2a\u9ad8\u65af\u5206\u5e03, \u5373 $q_\\theta(z \\mid X)=\\mathcal{N}\\left(\\mu, \\sigma^2 I\\right)$, \u5176\u4e2d $\\mu$ \u548c $\\sigma$ \u662f\u7531\u795e\u7ecf\u7f51\u7edc\u7ed9\u51fa\u7684\u3002\u5728\u8fd9\u79cd\u60c5\u51b5\u4e0b, $q_\\theta(z \\mid X)$ \u548c $p(z)$ \u4e4b\u95f4\u7684 KL \u6563\u5ea6\u53ef\u4ee5\u663e\u5f0f\u8ba1\u7b97\u51fa\u6765\uff1a<\/p>\n<p>\u9996\u5148, \u4e24\u4e2a\u4e00\u7ef4\u9ad8\u65af\u5206\u5e03\u4e4b\u95f4\u7684 KL \u6563\u5ea6\u7684\u516c\u5f0f\u4e3a:<br \/>\n$$<br \/>\n\\mathrm{KL}\\left(\\mathcal{N}\\left(\\mu_1, \\sigma_1^2\\right) \\| \\mathcal{N}\\left(\\mu_2, \\sigma_2^2\\right)\\right)=\\log \\frac{\\sigma_2}{\\sigma_1}+\\frac{\\sigma_1^2+\\left(\\mu_1-\\mu_2\\right)^2}{2 \\sigma_2^2}-\\frac{1}{2}<br \/>\n$$<\/p>\n<p>\u5728 VAE \u4e2d, \u8bbe\u5b9a\u5148\u9a8c\u5206\u5e03 $p(z)=\\mathcal{N}(0,1)$, \u5373 $\\mu_2=0, \\sigma_2^2=1$, \u800c\u540e\u9a8c\u5206\u5e03 $q_\\theta(z \\mid X)=\\mathcal{N}\\left(\\mu, \\sigma^2\\right)$, \u5373 $\\mu_1=\\mu, \\sigma_1^2=\\sigma^2$, \u5c06\u8fd9\u4e9b\u4ee3\u5165\uff0c\u53ef\u4ee5\u5f97\u5230:<br \/>\n$$<br \/>\n\\mathrm{KL}\\left(q_\\theta(z \\mid X) \\| p(z)\\right)=\\frac{\\mu^2+\\sigma^2-1-\\log \\sigma^2}{2}<br \/>\n$$<\/p>\n<p>\u8fd9\u5c31\u662f KL \u6563\u5ea6\u7684\u516c\u5f0f\u3002\u4f46\u662f, \u7531\u4e8e\u901a\u5e38\u4f7f\u7528 $\\log \\sigma^2$ \uff08\u4e0b\u6587\u4ecd\u8bb0\u4f5c $\\sigma$\uff09\u4f5c\u4e3a\u795e\u7ecf\u7f51\u7edc\u7684\u8f93\u51fa\uff08\u4e3a\u4e86\u786e\u4fdd $\\sigma^2$ \u7684\u975e\u8d1f\u6027\uff09\uff0c\u53ef\u4ee5\u5bf9\u516c\u5f0f\u518d\u505a\u4e00\u4e9b\u53d8\u6362:<br \/>\n$$<br 
\/>\n\\mathrm{KL}\\left(q_\\theta(z \\mid X) \\| p(z)\\right)=\\frac{\\mu^2+\\mathrm{e}^\\sigma-1-\\sigma}{2}<br \/>\n$$<\/p>\n<p>\u8fd9\u5c31\u662f VAE \u8bba\u6587\u4e2d\u6240\u7ed9\u51fa\u7684 KL \u6563\u5ea6\u9879\u7684\u516c\u5f0f\u3002\u5982\u679c\u9690\u53d8\u91cf\u6709 $l$ \u4e2a\u7ef4\u5ea6, \u90a3\u4e48\u6574\u4e2a KL \u6563\u5ea6\u9879\u5c31\u662f\u6240\u6709\u7ef4\u5ea6\u4e0a\u7684 $\\mathrm{KL}$ \u6563\u5ea6\u4e4b\u548c, \u516c\u5f0f\u5982\u4e0b:<br \/>\n$$<br \/>\n\\sum_{i=1}^l\\left(\\frac{1}{2}\\left(\\mu_i^2+\\exp \\left(\\sigma_i\\right)-1-\\sigma_i\\right)\\right)<br \/>\n$$<\/p>\n<p>\u8fd9\u5c31\u662f\u5728 VAE \u635f\u5931\u51fd\u6570\u4e2d\u9700\u8981\u6700\u5c0f\u5316\u7684 KL \u6563\u5ea6\u9879\u7684\u516c\u5f0f\u3002<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/?size=100&id=KKRe9LOjJRZ8&format=png&color=94D82D\" style=\"height:50px;display:inline\"> VAE\u7684\u7b80\u660e\u6307\u5bfc<\/h3>\n<hr \/>\n<p>\u4ece\u6a21\u578b\u89d2\u5ea6\u6765\u8bf4\uff0cVAE \u67b6\u6784\u5c31\u662f\u5728\u539f\u672c\u7684 AE \u7ed3\u6784\u4e0a, \u4e3a\u7f16\u7801\u6dfb\u52a0\u5408\u9002\u7684\u566a\u58f0\u3002\u5176\u539f\u56e0\u662f\u6734\u7d20\u7684\u81ea\u7f16\u7801\u5668\u5728\u8bad\u7ec3\u8fc7\u7a0b\u4e2d\u5f88\u5bb9\u6613\u51fa\u73b0<strong>\u8fc7\u62df\u5408\u73b0\u8c61<\/strong>\uff0c\u5982\u4e0b\u6240\u793a\uff1a<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914160300426.png\" 
style=\"height:300px\">\n<\/p>\n<p>\u5047\u8bbe\u7528\u4e00\u4e9b\u5168\u6708\u56fe\u548c\u4e00\u4e9b\u534a\u6708\u56fe\u53bb\u8bad\u7ec3\u4e00\u4e2aAE\uff0c\u7ecf\u8fc7\u8bad\u7ec3\uff0c\u6a21\u578b\u80fd\u591f\u5f88\u597d\u5730\u8fd8\u539f\u51fa\u8fd9\u4e24\u5f20\u56fe\u7247\uff0c\u5982\u56fe5-2\u6240\u793a\u3002\u63a5\u4e0b\u6765\u5728\u6f5c\u7a7a\u95f4\u4e2d\u53d6\u4e24\u5f20\u56fe\u7247\u7f16\u7801\u70b9\u4e2d\u4efb\u610f\u4e00\u70b9\uff0c\u5c06\u8fd9\u70b9\u4ea4\u7ed9\u89e3\u7801\u5668\u8fdb\u884c\u89e3\u7801\uff0c\u76f4\u89c9\u4e0a\u4f1a\u5f97\u5230\u4e00\u5f20\u4ecb\u4e8e\u5168\u6708\u56fe\u548c\u534a\u6708\u56fe\u4e4b\u95f4\u7684\u56fe\u7247\uff08\u5982\u9634\u5f71\u9762\u79ef\u8986\u76d63\/4\uff09\u3002\u7136\u800c\uff0c\u5b9e\u9645\u4e0a\uff0c\u8fd9\u4e2a\u70b9\u7ecf\u8fc7\u89e3\u7801\u5668\u89e3\u7801\u540e\u7684\u7ed3\u679c\u4e0d\u4ec5\u6a21\u7cca\u800c\u4e14\u8fd8\u662f\u4e71\u7801\u7684\uff0c\u8fd9\u5c31\u662f\u81ea\u7f16\u7801\u7684\u8fc7\u62df\u5408\u73b0\u8c61\u3002<\/p>\n<p><strong>\u4e3a\u4ec0\u4e48\u4f1a\u51fa\u73b0\u8fd9\u79cd\u73b0\u8c61\uff1f<\/strong><\/p>\n<p>\u4e00\u4e2a\u76f4\u89c2\u4e0a\u7684\u89e3\u91ca\u662fAE\u7684Encoder\u548cDecoder\u90fd\u4f7f\u7528\u4e86\u795e\u7ecf\u7f51\u7edc\uff0c\u795e\u7ecf\u7f51\u7edc\u662f\u4e00\u4e2a\u975e\u7ebf\u6027\u7684\u53d8\u6362\u8fc7\u7a0b\uff0c\u56e0\u6b64\u5728\u6f5c\u7a7a\u95f4\u4e2d\u70b9\u4e0e\u70b9\u4e4b\u95f4\u5173\u7cfb\u5f80\u5f80\u6ca1\u6709\u89c4\u5f8b\u53ef\u5faa\u3002\u89e3\u51b3\u6b64\u95ee\u9898\u7684\u4e00\u79cd\u65b9\u6cd5\u662f\u5f15\u5165\u566a\u58f0\uff0c\u4f7f\u5f97\u56fe\u7247\u7684\u7f16\u7801\u533a\u57df\u5f97\u5230\u6269\u5927\uff0c\u4ece\u800c\u63a9\u76d6\u6389\u5931\u771f\u7684\u7a7a\u767d\u7f16\u7801\u70b9\u3002<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914160357245.png\" 
style=\"height:300px\">\n<\/p>\n<p>\u5728\u5bf9\u4e24\u5f20\u56fe\u7247\u8fdb\u884c\u7f16\u7801\u65f6\u5f15\u5165\u4e00\u5b9a\u7684\u566a\u58f0\uff0c\u4f7f\u5f97\u6bcf\u4e2a\u56fe\u7247\u7684\u7f16\u7801\u70b9\u51fa\u73b0\u5728\u6f5c\u7a7a\u95f4\u7684\u77e9\u5f62\u9634\u5f71\u8303\u56f4\u5185\uff0c\u5982\u4e0a\u56fe\u6240\u793a\u3002<\/p>\n<ul>\n<li>\n<p>\u5728\u8bad\u7ec3\u6a21\u578b\u65f6\uff0c\u77e9\u5f62\u9634\u5f71\u8303\u56f4\u5185\u7684\u70b9\u90fd\u6709\u53ef\u80fd\u88ab\u91c7\u6837\u5230\uff0c\u8fd9\u6837\u89e3\u7801\u5668\u5728\u8bad\u7ec3\u8fc7\u7a0b\u4e2d\u4f1a\u5c3d\u53ef\u80fd\u5730\u5c06\u77e9\u5f62\u9634\u5f71\u5185\u7684\u70b9\u8fd8\u539f\u4e3a\u4e0e\u539f\u56fe\u76f8\u4f3c\u7684\u56fe\u7247\u3002\u63a5\u7740\uff0c\u5bf9\u4e4b\u524d\u63d0\u5230\u7684\u5931\u771f\u70b9\uff0c\u6b64\u65f6\u5b83\u4f4d\u4e8e\u5168\u6708\u56fe\u548c\u534a\u6708\u56fe\u7f16\u7801\u7684\u4ea4\u754c\u5904\u3002<\/p>\n<\/li>\n<li>\n<p>\u56e0\u6b64\uff0c\u89e3\u7801\u5668\u5e0c\u671b\u5931\u771f\u70b9\u65e2\u80fd\u5c3d\u91cf\u4e0e\u5168\u6708\u56fe\u76f8\u4f3c\uff0c\u53c8\u80fd\u5c3d\u91cf\u4e0e\u534a\u6708\u56fe\u76f8\u4f3c\uff0c\u56e0\u6b64\u5b83\u7684\u8fd8\u539f\u7ed3\u679c\u5c06\u662f\u4e24\u79cd\u56fe\u7684\u6298\u4e2d\uff08\u4f8b\u59823\/4\u7684\u5168\u6708\u56fe\uff09\u3002\u901a\u8fc7\u8fd9\u4e2a\u4f8b\u5b50\u53d1\u73b0\u7ed9\u7f16\u7801\u5668\u589e\u52a0\u4e00\u4e9b\u566a\u58f0\uff0c\u53ef\u4ee5\u6709\u6548\u8986\u76d6\u5931\u771f\u533a\u57df\u3002<\/p>\n<\/li>\n<li>\n<p>\u7136\u800c\uff0c\u5f15\u5165\u533a\u57df\u566a\u58f0\u7684\u65b9\u6cd5\u8fd8\u4e0d\u591f\u5145\u5206\uff0c\u56e0\u4e3a\u566a\u58f0\u7684\u8303\u56f4\u603b\u662f\u6709\u9650\u7684\uff0c\u4e0d\u53ef\u80fd\u8986\u76d6\u6240\u6709\u91c7\u6837\u70b9\u3002<\/p>\n<\/li>\n<li>\n<p>\u4e3a\u4e86\u89e3\u51b3\u6b64\u95ee\u9898\uff0c\u53ef\u4ee5\u5c1d\u8bd5\u5c06\u566a\u58f0\u7684\u8303\u56f4\u65e0\u9650\u5ef6\u4f38\uff0c\u4ee5\u4f7f\u5f97\u5bf9\u4e8e\u6bcf\u4e2a\u6837\u672c\uff0c\u5176\u7f16\u7801\u80fd\u591f\
u8986\u76d6\u6574\u4e2a\u7f16\u7801\u7a7a\u95f4\u3002\u4f46\u662f\u9700\u8981\u786e\u4fdd\uff0c\u5728\u539f\u59cb\u7f16\u7801\u9644\u8fd1\u7684\u7f16\u7801\u70b9\u5177\u6709\u6700\u9ad8\u7684\u6982\u7387\uff0c\u968f\u7740\u79bb\u539f\u59cb\u7f16\u7801\u70b9\u7684\u8ddd\u79bb\u589e\u52a0\uff0c\u7f16\u7801\u7684\u6982\u7387\u9010\u6e10\u51cf\u5c0f\u3002<\/p>\n<\/li>\n<li>\n<p>\u5728\u8fd9\u79cd\u60c5\u51b5\u4e0b\uff0c\u56fe\u50cf\u7684\u7f16\u7801\u5c06\u4ece\u539f\u6765\u79bb\u6563\u7684\u7f16\u7801\u70b9\u53d8\u6210\u4e00\u4e2a\u8fde\u7eed\u7684\u7f16\u7801\u5206\u5e03\u66f2\u7ebf\uff0c\u5982\u56fe<\/p>\n<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914160429607.png\" style=\"height:300px\">\n<\/p>\n<p>\u8fd9\u79cd\u5c06\u56fe\u50cf\u7f16\u7801\u7531\u79bb\u6563\u53d8\u4e3a\u8fde\u7eed\u7684\u65b9\u6cd5\uff0c\u5c31\u662f\u53d8\u5206\u81ea\u7f16\u7801\u7684\u6838\u5fc3\u601d\u60f3\u3002\u5176\u7ed3\u6784\u5982\u4e0b\u56fe\uff1a<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914160527396.png\" style=\"height:300px\">\n<\/p>\n<p>\u9996\u5148\u5c06 input \u8f93\u5165\u5230\u7f16\u7801\u5668, \u8ba1\u7b97\u51fa\u4e24\u7ec4\u7f16\u7801: <\/p>\n<ul>\n<li>\u4e00\u7ec4\u7f16\u7801\u4e3a\u5747\u503c\u7f16\u7801 $m=\\left(m_1, m_2, m_3\\right)$, <\/li>\n<li>\u53e6\u4e00\u7ec4\u4e3a\u63a7\u5236\u566a\u58f0\u5e72\u6270\u7a0b\u5ea6\u7684\u65b9\u5dee\u7f16\u7801 $\\sigma=\\left(\\sigma_1, \\sigma_2, \\sigma_3\\right)$, \u8fd9\u4e24\u7ec4\u53c2\u6570\u5206\u522b\u901a\u8fc7\u4e24\u4e2a\u795e\u7ecf\u7f51\u7edc\u8ba1\u7b97\u5f97\u5230\u3002<\/li>\n<li>\u5176\u4e2d\u65b9\u5dee\u7f16\u7801 $\\sigma$ \u4e3b\u8981\u7528\u6765\u4e3a\u566a\u97f3\u7f16\u7801 $z=\\left(e_1, e_2, e_3\\right)$ \u5206\u914d\u6743\u91cd, 
\u5728\u5206\u914d\u6743\u91cd\u4e4b\u524d\u5bf9\u65b9\u5dee\u7f16\u7801 $\\sigma$ \u8fdb\u884c\u4e86\u6307\u6570\u8fd0\u7b97\uff0c\u4e3b\u8981\u662f\u56e0\u4e3a\u795e\u7ecf\u7f51\u7edc\u5b66\u4e60\u51fa\u6765\u7684\u6743\u91cd\u503c\u662f\u6709\u6b63\u8d1f\u503c\u7684\uff0c\u52a0\u5165\u6307\u6570\u8fd0\u7b97\u4fdd\u8bc1\u5206\u914d\u5230\u7684\u6743\u91cd\u662f\u6b63\u503c\u3002<\/li>\n<li>\u6700\u540e\uff0c\u5c06\u539f\u7f16\u7801 $m$ \u548c\u7ecf\u8fc7\u6743\u91cd\u5206\u914d\u540e\u566a\u58f0\u7f16\u7801\u8fdb\u884c\u53e0\u52a0\uff0c\u5f97\u5230\u65b0\u7684\u9690\u53d8\u91cf\uff0c\u518d\u9001\u5165\u89e3\u7801\u5668\u3002<\/li>\n<li>\u635f\u5931\u51fd\u6570\u8fd9\u4e00\u9879\u9664\u4e86\u4e4b\u524d\u4f20\u7edfAE\u7684\u91cd\u6784\u635f\u5931\u4ee5\u5916, \u8fd8\u591a\u4e86\u4e00\u9879\u635f\u5931: $\\sum_{i=1}^3\\left(\\exp \\left(\\sigma_i\\right)-\\left(1+\\sigma_i\\right)+\\left(m_i\\right)^2\\right)$ \u3002\u8fd9\u4e00\u9879\u7684\u63a8\u5bfc\u5728\u4e4b\u524d\u4e5f\u5df2\u7ecf\u8bc1\u660e\u8fc7\u3002<\/li>\n<\/ul>\n<p>\u5b9e\u9645\u4e0a\uff0c\u4e5f\u53ef\u4ee5\u8fd0\u7528<strong>\u53cd\u8bc1\u6cd5<\/strong>\u7684\u601d\u60f3\u6765\u63a8\u6572\u8fd9\u4e2a\u65b0\u635f\u5931\u7684\u610f\u4e49\u3002<\/p>\n<ul>\n<li>\u5f53\u4e0d\u5f15\u5165\u8fd9\u4e2a\u635f\u5931\u51fd\u6570\u65f6\uff0c\u6a21\u578b\u4f1a\u52aa\u529b\u51cf\u5c11\u751f\u6210\u56fe\u7247\u7684\u91cd\u6784\u8bef\u5dee\u6765\u63d0\u9ad8\u56fe\u7247\u8d28\u91cf\u3002<\/li>\n<li>\u4e3a\u4e86\u5b9e\u73b0\u8fd9\u4e00\u70b9\uff0c\u7f16\u7801\u5668\u4f1a\u671f\u671b\u51cf\u5c11\u566a\u97f3\u5bf9\u751f\u6210\u56fe\u7247\u7684\u5f71\u54cd\uff0c\u964d\u4f4e\u4efb\u52a1\u96be\u5ea6\u3002\u56e0\u6b64\uff0c\u5b83\u4f1a\u503e\u5411\u4e8e\u7ed9\u566a\u97f3\u5206\u914d\u8f83\u4f4e\u7684\u6743\u91cd\u3002\u5982\u679c\u6ca1\u6709\u4efb\u4f55\u7ea6\u675f\u9650\u5236\uff0c\u7f51\u7edc\u53ea\u9700\u8981\u5c06\u65b9\u5dee\u7f16\u7801\u8bbe\u7f6e\u4e3a\u63a5\u8fd1\u8d1f\u65e0\u7a77\u5927\u7684\u503c\uff0c\u4ece\u800c\u6
d88\u9664\u566a\u97f3\u7684\u5f71\u54cd\u3002<\/li>\n<li>\u5373$(\\exp \\left(\\sigma_i\\right) e_i=0)$, \u6b64\u65f6 $(m_i)$ \u5c31\u7b49\u4e8e\u5b83\u672c\u8eab, \u6a21\u578b\u5c31\u9000\u5316\u6210\u4e86\u666e\u901a\u7684\u81ea\u7f16\u7801\u5668\uff0c\u8fc7\u62df\u5408\u95ee\u9898\u5c31\u4f1a\u5377\u571f\u91cd\u6765\u3002\u5c3d\u7ba1\u6b64\u65f6\u6a21\u578b\u7684\u8bad\u7ec3\u6548\u679c\u53ef\u80fd\u975e\u5e38\u597d\uff0c\u4f46\u751f\u6210\u7684\u56fe\u7247\u5f80\u5f80\u4f1a\u975e\u5e38\u7cdf\u7cd5\u3002<\/li>\n<\/ul>\n<p>\u4e3a\u4e86\u65b9\u4fbf\u7406\u89e3\uff0c\u53ef\u4ee5\u505a\u4e00\u4e2a\u751f\u6d3b\u4e2d\u7684\u7c7b\u6bd4\u3002\u5c06\u53d8\u5206\u81ea\u7f16\u7801\u5668\uff08VAE\uff09\u7684\u5de5\u4f5c\u8fc7\u7a0b\u7c7b\u6bd4\u4e3a\u53c2\u52a0\u9ad8\u8003\u7684\u8fc7\u7a0b\u3002\u5728\u5b66\u751f\u4eec\u51c6\u5907\u9ad8\u8003\u7684\u8fc7\u7a0b\u4e2d\uff0c\u4ed6\u4eec\u9700\u8981\u8fdb\u884c\u5927\u91cf\u7684\u6a21\u62df\u8003\u8bd5\u4ee5\u63d0\u9ad8\u6700\u7ec8\u7684\u8003\u8bd5\u6210\u7ee9\uff0c\u8fd9\u5c31\u50cfVAE\u5728\u8bad\u7ec3\u9636\u6bb5\u6240\u505a\u7684\u4e8b\u60c5\u3002 \u6a21\u62df\u8003\u8bd5\u7684\u9898\u76ee\u548c\u96be\u5ea6\u7531\u8001\u5e08\u5b89\u6392\uff0c\u8fd9\u80fd\u591f\u516c\u6b63\u5730\u8bc4\u4f30\u5b66\u751f\u4eec\u7684\u5b66\u4e60\u80fd\u529b\u3002\u7c7b\u4f3c\u5730\uff0cVAE\u7684\u8bad\u7ec3\u8fc7\u7a0b\u4e2d\uff0c\u5b83\u751f\u6210\u7684\u6570\u636e\u7684\u5206\u5e03\uff0c\u5373\u65b9\u5dee\u7f16\u7801 
\u03c3\uff0c\u5e94\u7531\u67d0\u4e2a\u635f\u5931\u51fd\u6570\uff08\u53ef\u4ee5\u7406\u89e3\u4e3a\u201c\u8001\u5e08\u201d\uff09\u6765\u51b3\u5b9a\u3002\u5982\u679c\u6ca1\u6709\u8001\u5e08\u7684\u76d1\u7763\uff0c\u8ba9\u5b66\u751f\u4eec\u81ea\u5df1\u8bbe\u7f6e\u6a21\u62df\u8003\u8bd5\u7684\u96be\u5ea6\uff0c\u4ed6\u4eec\u5f88\u53ef\u80fd\u5c06\u8bd5\u9898\u8bbe\u5f97\u975e\u5e38\u7b80\u5355\uff0c\u4ee5\u4fbf\u5f97\u9ad8\u5206\u3002\u8fd9\u5c31\u50cfVAE\u5728\u6ca1\u6709\u9002\u5f53\u7684\u635f\u5931\u51fd\u6570\u7ea6\u675f\u65f6\uff0c\u53ef\u80fd\u503e\u5411\u4e8e\u964d\u4f4e\u566a\u58f0\u7684\u5f71\u54cd\uff0c\u8ba9\u6a21\u578b\u91cd\u6784\u8bef\u5dee\u5c3d\u53ef\u80fd\u5730\u63a5\u8fd1\u4e8e\u96f6\u3002 \u56e0\u6b64\uff0c\u4e3a\u4e86\u4fdd\u8bc1VAE\u5728\u5b9e\u9645\u5e94\u7528\u4e2d\u7684\u8868\u73b0\uff0c\u800c\u4e0d\u662f\u901a\u8fc7\u964d\u4f4e\u566a\u58f0\u5f71\u54cd\u6765\u201c\u6295\u673a\u53d6\u5de7\u201d\uff0c\u9700\u8981\u5f15\u5165\u4e00\u4e2a\u9002\u5f53\u7684\u635f\u5931\u51fd\u6570\u3002\u8fd9\u4e2a\u635f\u5931\u51fd\u6570\u5c31\u50cf\u8001\u5e08\u4e00\u6837\uff0c\u76d1\u7763VAE\u7684\u8bad\u7ec3\u8fc7\u7a0b\uff0c\u786e\u4fdd\u6a21\u578b\u5728\u9002\u5f53\u7684\u96be\u5ea6\u4e0b\u8fdb\u884c\u8bad\u7ec3\uff0c\u4ece\u800c\u80fd\u591f\u5728\u590d\u6742\u7684\u771f\u5b9e\u4e16\u754c\u4efb\u52a1\u4e2d\u8868\u73b0\u5f97\u66f4\u597d\u3002<\/p>\n<p>\u6240\u4ee5\uff0c\u9664\u4e86\u6700\u5c0f\u5316\u91cd\u6784\u8bef\u5dee\uff0cVAE\u8fd8\u8ba9\u6240\u6709\u7684<strong>\u7f16\u7801 c \u90fd\u5411\u6807\u51c6\u6b63\u6001\u5206\u5e03\u770b\u9f50<\/strong>\uff0c\u8fd9\u6837\u5c31\u9632\u6b62\u4e86\u51fa\u73b0 $\\exp \\left(\\sigma_i\\right) e_i=0$\uff08\u5373\u566a\u58f0\u4e3a\u96f6\uff09\u7684\u60c5\u51b5\uff0c\u4fdd\u8bc1VAE\u6a21\u578b\u4e0d\u4f1a\u9000\u5316\u6210\u81ea\u7f16\u7801\u5668\uff0c\u5982\u4e0b\u56fe\u6240\u793a\uff1a<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" 
src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914160757393.png\" style=\"height:400px\">\n<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/?size=100&id=48130&format=png&color=000000\" style=\"height:50px;display:inline\"> Seq2Seq Models<\/h3>\n<hr \/>\n<p>Seq2Seq\u6a21\u578b\u53ef\u4ee5\u88ab\u8ba4\u4e3a\u662f\u4e00\u79cdEncoder-Decoder\u6a21\u578b\u7684\u53d8\u4f53\uff0c\u5176\u7279\u522b\u9002\u7528\u4e8e\u5904\u7406\u5e8f\u5217\u5230\u5e8f\u5217\u7684\u4efb\u52a1\u3002\u7f16\u7801\u5668\u5c06\u8f93\u5165\u5e8f\u5217\u6620\u5c04\u4e3a\u4e00\u4e2a\u56fa\u5b9a\u957f\u5ea6\u7684\u5411\u91cf\u8868\u793a\uff0c\u89e3\u7801\u5668\u5219\u4f7f\u7528\u8fd9\u4e2a\u5411\u91cf\u8868\u793a\u6765\u751f\u6210\u8f93\u51fa\u5e8f\u5217\u3002\u5b83\u5df2\u88ab\u5e7f\u6cdb\u5e94\u7528\u4e8e\u673a\u5668\u7ffb\u8bd1\u3001\u5bf9\u8bdd\u7cfb\u7edf\u3001\u8bed\u97f3\u8bc6\u522b\u7b49\u81ea\u7136\u8bed\u8a00\u5904\u7406\u4efb\u52a1\u3002Seq2Seq\u6a21\u578b\u7684\u7ed3\u6784\u6846\u67b6\u5982\u4e0b\u56fe\u6240\u793a\uff1a<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914160831160.png\" style=\"height:300px\">\n<\/p>\n<p>\u4ee5\u673a\u5668\u7ffb\u8bd1\u4efb\u52a1\u4e3a\u4f8b\u6765\u8bb2\u89e3\u56fe1\u7684Seq2Seq\u7ed3\u6784\u3002\u5047\u8bbe\u73b0\u5728\u7684\u6a21\u578b\u8f93\u5165\u662f\u6587\u672c\u5e8f\u5217\u201chello world\u201d\uff0c\u60f3\u5f97\u5230\u8f93\u51fa\u7ed3\u679c\u4e3a\u201c\u4f60\u597d\uff0c\u4e16\u754c\u201d\u3002\u5728Seq2Seq\u6a21\u578b\u4e2d\uff0c<strong>\u7f16\u7801\u5668\u8d1f\u8d23\u5c06\u8f93\u5165\u5e8f\u5217\u6620\u5c04\u5230\u4e00\u4e2a\u7279\u5f81\u5411\u91cf 
c<\/strong>\uff0c\u5e0c\u671b\u901a\u8fc7\u8bad\u7ec3\u53ef\u4ee5\u8ba9\u8be5\u5411\u91cf\u63d0\u53d6\u5230\u8f93\u5165\u4fe1\u606f\u7684\u8bed\u4e49\u7279\u5f81\uff0c\u5c06\u6765\u9001\u5165\u89e3\u7801\u5668\u4e2d\u4f5c\u4e3a\u89e3\u7801\u5668\u7684\u4e00\u90e8\u5206\u8f93\u5165\u4fe1\u606f\u3002\u7f16\u7801\u5668\u901a\u5e38\u91c7\u7528\u5faa\u73af\u795e\u7ecf\u7f51\u7edc\u6216\u5377\u79ef\u795e\u7ecf\u7f51\u7edc\u6765\u5904\u7406\u8f93\u5165\u5e8f\u5217\u3002<\/p>\n<p>\u89e3\u7801\u5668\u8d1f\u8d23\u751f\u6210\u8f93\u51fa\u5e8f\u5217\uff0c\u5b83\u901a\u5e38\u4e5f\u91c7\u7528\u5faa\u73af\u795e\u7ecf\u7f51\u7edc\u6216\u5176\u5b83\u53d8\u4f53\u6765\u5904\u7406\u8f93\u51fa\u5e8f\u5217\u3002\u4e0e\u7f16\u7801\u5668\u7c7b\u4f3c\uff0c\u89e3\u7801\u5668\u4e5f\u7531\u591a\u4e2a\u65f6\u95f4\u6b65\u7ec4\u6210\u3002\u5728\u6bcf\u4e2a\u65f6\u95f4\u6b65\uff0c\u89e3\u7801\u5668\u4f7f\u7528\u524d\u4e00\u4e2a\u65f6\u95f4\u6b65\u7684\u8f93\u51fa\u5143\u7d20\u548c\u5f53\u524d\u65f6\u95f4\u6b65\u7684\u9690\u85cf\u72b6\u6001\u6765\u751f\u6210\u5f53\u524d\u65f6\u95f4\u6b65\u7684\u8f93\u51fa\u5143\u7d20\u3002\u5177\u4f53\u6765\u8bf4\uff0c\u5728\u89e3\u7801\u5668\u7684\u521d\u59cb\u65f6\u523b, \u901a\u5e38\u4f1a\u5c06\u7f16\u7801\u5668\u8f93\u51fa\u7684\u7279\u5f81\u5411\u91cf $c$ \u548c\u8d77\u59cb\u5411\u91cf$bos$\u62fc\u63a5\u5230\u4e00\u8d77,\u4f5c\u4e3a\u521d\u59cb\u65f6\u523b\u7684\u9690\u85cf\u72b6\u6001 $s_0$, \u5e76\u8ba1\u7b97\u5f97\u5230\u5f53\u524d\u65f6\u523b\u7684\u8f93\u51fa\u4fe1\u606f $y_1$ \u3002\u5728\u4e0b\u4e00\u65f6\u523b  $t_1$, \u4f1a\u5c06\u7f16\u7801\u5668\u8f93\u51fa\u7684\u7279\u5f81\u5411\u91cf $c$ \u548c $y_1$ \u62fc\u63a5\u6210\u4e00\u4e2a\u5411\u91cf\u4f5c\u4e3a\u5f53\u524d\u65f6\u523b $t_1$ \u7684\u8f93\u5165, \u5e76\u8ba1\u7b97\u5f97\u5230\u5f53\u524d\u65f6\u523b\u7684\u8f93\u51fa $y_2$\u3002\u4ee5\u6b64\u7c7b\u63a8\u4e0b\u53bb, 
\u76f4\u5230\u6a21\u578b\u9884\u6d4b\u51fa\u7ed3\u675f\u5411\u91cf$eos$\u65f6\u505c\u6b62\u3002\u4e0d\u96be\u53d1\u73b0, \u89e3\u7801\u5668\u7684\u8f93\u5165\u4fe1\u606f\u4e0d\u4ec5\u5305\u542b\u4e86\u5f53\u524d\u8981\u7ffb\u8bd1\u7684\u5355\u8bcd\uff0c\u8fd8\u5305\u542b\u4e86\u5f53\u524d\u8fd9\u53e5\u8bdd\u7684\u4e0a\u4e0b\u6587\u8bed\u4e49\u4fe1\u606f; \u89e3\u7801\u5668\u4f1a\u7ed3\u5408\u8fd9\u4e24\u90e8\u5206\u4fe1\u606f\u9884\u6d4b\u5f53\u524d\u7684\u8f93\u51fa\uff0c\u8fd9\u5c31\u662f\u89e3\u7801\u5668\u7ffb\u8bd1\u6587\u672c\u4e3a\u4ec0\u4e48\u6709\u6548\u3002<\/p>\n<h3>Seq2Seq\u7684attention\u673a\u5236<\/h3>\n<h4><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/?size=100&id=91CnU00i6HLv&format=png&color=000000\" style=\"height:50px;display:inline\"> \u8003\u8651\u5728\u89e3\u7801\u5668\u7ffb\u8bd1\u6bcf\u4e00\u4e2a\u5355\u8bcd\u65f6\u90fd\u7528\u8fd9\u540c\u4e00\u4e2a\u7279\u5f81\u5411\u91cf c \u4f5c\u4e3a\u8f93\u5165\u662f\u5426\u5408\u9002\uff1f<\/h4>\n<hr \/>\n<p><strong>\u5bf9\u6ce8\u610f\u529b\u5206\u6570\u8fdb\u884c\u5efa\u6a21<\/strong><\/p>\n<p>\u5047\u8bbe\u8f93\u5165\u5e8f\u5217\u662f $x_1, x_2, \\cdots, x_n$, \u5bf9\u5e94\u7684\u7f16\u7801\u5668\u9690\u85cf\u72b6\u6001\u4e3a $h_1, h_2, \\cdots, h_n$  \u3002\u89e3\u7801\u5668\u7684\u5f53\u524d\u9690\u85cf\u72b6\u6001\u4e3a $s_t$\u3002\u9996\u5148\u8ba1\u7b97\u89e3\u7801\u5668\u9690\u85cf\u72b6\u6001 $s_t$ \u4e0e\u6240\u6709\u7f16\u7801\u5668\u9690\u85cf\u72b6\u6001 $h_i$ \u4e4b\u95f4\u7684\u76f8\u4f3c\u6027\u5206\u6570 $e_{t, i}$ \u3002\u53ef\u4ee5\u901a\u8fc7\u70b9\u79ef\u3001\u52a0\u6743\u70b9\u79ef\u6216\u5176\u4ed6\u76f8\u4f3c\u6027\u5ea6\u91cf\u6765\u8ba1\u7b97:<br \/>\n$$<br \/>\ne_{t, i}=\\operatorname{score}\\left(s_t, h_i\\right)<br \/>\n$$<\/p>\n<p>$s_t$ \u548c\u67d0\u4e9b $h_i$ \u8d8a\u76f8\u4f3c\uff0c\u610f\u5473\u7740\u8fd9\u4e9b\u7f16\u7801\u5668\u9690\u85cf\u72b6\u6001\u7684\u8bed\u4e49\u4fe1\u606f $h_i$ \u5c31\u662f\u5728\u7ffb\u8bd1\u6587\u672c $x_t$ 
\u65f6\u5e94\u8be5\u7740\u91cd\u5173\u6ce8\u7684\u5730\u65b9\u3002 <\/p>\n<p>\u63a5\u4e0b\u6765, \u5c06\u8fd9\u4e9b\u5206\u6570\u8f6c\u6362\u4e3a\u6743\u91cd, \u901a\u8fc7 softmax \u51fd\u6570\u8fdb\u884c\u5f52\u4e00\u5316:<br \/>\n$$<br \/>\n\\alpha_{t, i}=\\frac{\\exp \\left(e_{t, i}\\right)}{\\sum_{j=1}^n \\exp \\left(e_{t, j}\\right)}<br \/>\n$$<\/p>\n<p>\u5176\u4e2d$\\alpha_{t, i}$  \u53ef\u4ee5\u770b\u4f5c\u662f\u89e3\u7801\u5668\u5728\u65f6\u95f4\u6b65 $t$ \u65f6\u5173\u6ce8\u7f16\u7801\u5668\u9690\u85cf\u72b6\u6001 $h_i$ \u7684\u6743\u91cd\u3002\u63a5\u7740\u8ba1\u7b97\u52a0\u6743\u548c\u7684\u4e0a\u4e0b\u6587\u5411\u91cf $c_t$, \u516c\u5f0f\u5982\u4e0b:<br \/>\n$$<br \/>\nc_t=\\sum_{i=1}^n \\alpha_{t, i} h_i<br \/>\n$$<\/p>\n<p>\u4e0a\u4e0b\u6587\u5411\u91cf $c_t$ \u6355\u83b7\u4e86\u8f93\u5165\u5e8f\u5217\u4e0e\u89e3\u7801\u5668\u5f53\u524d\u65f6\u95f4\u6b65\u7684\u76f8\u5173\u6027\u3002\u7136\u540e, \u5c06\u4e0a\u4e0b\u6587\u5411\u91cf $c_t$ \u4e0e\u89e3\u7801\u5668\u7684\u9690\u85cf\u72b6\u6001$s_t$ \u7ed3\u5408\u8d77\u6765\uff0c\u4ee5\u751f\u6210\u4e0b\u4e00\u4e2a\u8f93\u51fa\u5355\u8bcd\u7684\u6982\u7387\u5206\u5e03, \u516c\u5f0f\u5982\u4e0b:<br \/>\n$$<br \/>\ny_t=\\operatorname{softmax}\\left(W\\left(s_t, c_t\\right)+b\\right)<br \/>\n$$<\/p>\n<p>\u5176\u4e2d $W$ \u548c $b$ \u662f\u9700\u8981\u5b66\u4e60\u7684\u6743\u91cd\u548c\u504f\u7f6e\u9879\u3002\u901a\u8fc7\u8fd9\u79cd\u65b9\u5f0f\uff0c\u6ce8\u610f\u529b\u673a\u5236\u5e2e\u52a9Seq2Seq\u6a21\u578b\u5173\u6ce8\u8f93\u5165\u5e8f\u5217\u7684\u91cd\u8981\u90e8\u5206\uff0c\u4ece\u800c\u63d0\u9ad8\u4e86\u9884\u6d4b\u6027\u80fd\u3002<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/cute-clipart\/64\/000000\/task.png\" style=\"height:50px;display:inline\"> \u81ea\u76d1\u7763\u5b66\u4e60\u65b9\u6cd5\u4e0e\u6a21\u578b\uff08Self-Supervised Learning\uff09<\/h3>\n<hr 
\/>\n<ul>\n<li>\n<p>\u4e00\u79cd\u65e0\u76d1\u7763\u5b66\u4e60\u7684\u7248\u672c\uff0c\u5176\u4e2d<strong>\u6570\u636e\u63d0\u4f9b\u76d1\u7763<\/strong>\u3002<\/p>\n<\/li>\n<li>\n<p><strong>\u6838\u5fc3\u60f3\u6cd5<\/strong>\uff1a\u9690\u85cf\u90e8\u5206\u6570\u636e\uff0c\u7136\u540e\u8ba9\u795e\u7ecf\u7f51\u7edc\u6839\u636e\u5269\u4f59\u90e8\u5206\u9884\u6d4b\u88ab\u9690\u85cf\u7684\u90e8\u5206\u3002<\/p>\n<\/li>\n<li>\n<p>\u4f18\u4e8e\u76d1\u7763\u5b66\u4e60\u7684\u5730\u65b9\uff1a<\/p>\n<ol>\n<li>\u4e3a\u6bcf\u4e2a\u4efb\u52a1\u751f\u6210\u65b0\u6570\u636e\u96c6\u7684\u6210\u672c\u5f88\u9ad8\uff08\u51c6\u5907\u6807\u7b7e\u624b\u518c\u3001\u7c7b\u522b\u3001\u96c7\u7528\u4eba\u5458\u3001\u521b\u5efa GUI\u3001\u5b58\u50a8\u7ba1\u9053\u7b49\uff09\u3002<\/li>\n<li>\u826f\u597d\u7684\u76d1\u7763\u53ef\u80fd\u5e76\u4e0d\u4fbf\u5b9c\uff08\u4f8b\u5982\uff0c\u533b\u5b66\u3001\u6cd5\u5f8b\uff09\u3002<\/li>\n<li>\u5229\u7528\u4e92\u8054\u7f51\u4e0a\u5927\u91cf\u672a\u6807\u8bb0\u7684\u6570\u636e\uff08\u56fe\u50cf\u3001\u89c6\u9891\u3001\u8bed\u8a00\uff09\u3002<\/li>\n<\/ol>\n<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914161617404.png\" 
style=\"height:300px\">\n<\/p>\n<ul>\n<li>\u4ece\u8fc7\u53bb\u7684\u6570\u636e\u9884\u6d4b\u672a\u6765\u7684\u6570\u636e\u3002\u8fd9\u5728\u65f6\u95f4\u5e8f\u5217\u9884\u6d4b\uff08\u5982\u80a1\u4ef7\u9884\u6d4b\u3001\u5929\u6c14\u9884\u62a5\uff09\u4e2d\u975e\u5e38\u5e38\u89c1\u3002<\/li>\n<li>\u4ece\u6700\u8fd1\u7684\u8fc7\u53bb\u9884\u6d4b\u5373\u5c06\u53d1\u751f\u7684\u4e8b\u4ef6\u3002\u8fd9\u79cd\u65b9\u6cd5\u5728\u77ed\u671f\u9884\u6d4b\u4e2d\u5c24\u4e3a\u6709\u6548\u3002<\/li>\n<li>\u4ece\u5f53\u524d\u7684\u6570\u636e\u9884\u6d4b\u8fc7\u53bb\u7684\u6570\u636e\u3002\u8fd9\u5728\u6570\u636e\u4fee\u590d\u548c\u7f3a\u5931\u6570\u636e\u586b\u5145\u4e2d\u6709\u5e94\u7528\u3002<\/li>\n<li>\u5728\u7a7a\u95f4\u5e8f\u5217\u6570\u636e\u4e2d\uff0c\u4ece\u4e0b\u90e8\u6570\u636e\u9884\u6d4b\u4e0a\u90e8\u6570\u636e\u3002\u4f8b\u5982\uff0c\u5728\u56fe\u50cf\u6570\u636e\u4e2d\uff0c\u53ef\u4ee5\u4ece\u56fe\u50cf\u7684\u4e0b\u534a\u90e8\u5206\u9884\u6d4b\u4e0a\u534a\u90e8\u5206\u3002<\/li>\n<li>\u4ece\u53ef\u89c1\u7684\u90e8\u5206\u9884\u6d4b\u88ab\u906e\u6321\u7684\u90e8\u5206\u3002\u4f8b\u5982\uff0c\u5728\u56fe\u50cf\u5904\u7406\u4e2d\uff0c\u4ece\u672a\u88ab\u906e\u6321\u7684\u90e8\u5206\u9884\u6d4b\u88ab\u906e\u6321\u7684\u90e8\u5206\u3002<\/li>\n<\/ul>\n<hr \/>\n<p><strong>\u4e0b\u9762\u4ecb\u7ecd\u4e00\u4e9b\u7ecf\u5178\u7684\u4f7f\u7528\u81ea\u76d1\u7763\u7684\u6a21\u578b\u3002<\/strong><\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/external-wanicon-lineal-color-wanicon\/64\/null\/external-mask-brazilian-carnival-wanicon-lineal-color-wanicon.png\" style=\"height:50px;display:inline\">  Masked Autoencoders (Vision Transformers)<\/h3>\n<hr \/>\n<ul>\n<li>\u867d\u7136\u81ea\u76d1\u7763\u9886\u57df\u5df2\u7ecf\u968f\u7740\u5bf9\u6bd4\u65b9\u6cd5\u7684\u5f15\u5165\u800c\u98ce\u9761\u4e00\u65f6\uff0c\u4f46 Vision Transformers (ViT) 
\u7684\u6700\u65b0\u53d1\u5c55\u91cd\u65b0\u5524\u8d77\u4e86\u8499\u7248\u56fe\u50cf\u5efa\u6a21\u7684\u7b80\u5355\u60f3\u6cd5\u3002<\/li>\n<li><a href=\"https:\/\/arxiv.org\/abs\/2111.06377\">Masked Autoencoders Are Scalable Vision Learners, He et al. 2021.<\/a><\/li>\n<li><strong>Masked Autoencoders<\/strong>\uff1a\u5728\u9884\u8bad\u7ec3\u671f\u95f4\uff0c<strong>\u5927\u91cf\u968f\u673a\u56fe\u50cf\u5757\u5b50\u96c6<\/strong>\uff08\u4f8b\u5982 75%\uff09\u88ab\u8499\u7248\u3002\u7f16\u7801\u5668\u5e94\u7528\u4e8e\u53ef\u89c1\u5757\u7684\u5c0f\u5b50\u96c6\u3002\u5728\u7f16\u7801\u5668\u4e4b\u540e\u5f15\u5165\u8499\u7248\u6807\u8bb0\uff0c\u5168\u5957\u7f16\u7801\u5757\u548c\u8499\u7248\u6807\u8bb0\u7531\u5c0f\u578b\u89e3\u7801\u5668\u5904\u7406\uff0c\u8be5\u89e3\u7801\u5668\u4ee5\u50cf\u7d20\u4e3a\u5355\u4f4d\u91cd\u5efa\u539f\u59cb\u56fe\u50cf\u3002\u9884\u8bad\u7ec3\u540e\uff0c\u89e3\u7801\u5668\u88ab\u4e22\u5f03\uff0c\u7f16\u7801\u5668\u5e94\u7528\u4e8e\u672a\u635f\u574f\u7684\u56fe\u50cf\uff08\u5168\u5957\u5757\uff09\u4ee5\u6267\u884c\u8bc6\u522b\u4efb\u52a1\u3002<\/li>\n<li>\u63a9\u853d AE \u8868\u73b0\u51fa\u4e0e\u5bf9\u6bd4\u65b9\u6cd5\u76f8\u5f53\u7684\u6027\u80fd\uff0c\u540c\u65f6\u66f4\u6613\u4e8e\u5b9e\u73b0\u548c\u7406\u89e3\u3002<\/li>\n<li>Code:\n<ul>\n<li>HuggingFace: <a href=\"https:\/\/huggingface.co\/docs\/transformers\/model_doc\/vit_mae\">ViTMAE<\/a><\/li>\n<li>GitHub: <a href=\"https:\/\/github.com\/facebookresearch\/mae\">Official PyTorch implementation (FAIR)<\/a>, <a href=\"https:\/\/github.com\/EdisonLeeeee\/Awesome-Masked-Autoencoders\">Awesome MAE Models<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914161707224.png\" style=\"height:300px\">\n<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" 
src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914161742331.png\" style=\"height:220px\">\n<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914161818368.png\" style=\"height:200px\">\n<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/?size=100&id=pVCJSBDTZxYl&format=png&color=000000\" style=\"height:50px;display:inline\">  Bert And GPT<\/h3>\n<hr \/>\n<p>\u5728\u81ea\u76d1\u7763\u5b66\u4e60\u80cc\u666f\u4e0b\uff0cBERT (Bidirectional Encoder Representations from Transformers) \u548c GPT (Generative Pre-trained Transformer) \u662f\u81ea\u7136\u8bed\u8a00\u5904\u7406\u9886\u57df\u975e\u5e38\u7ecf\u5178\u7684\u5de5\u4f5c\uff1a<\/p>\n<p><strong>BERT\u7684\u8bad\u7ec3\u65b9\u5f0f<\/strong><\/p>\n<ul>\n<li>\n<p>BERT\u7684\u8bad\u7ec3\u76ee\u6807\u662fMasked Language Modeling (MLM) \u548c Next Sentence Prediction (NSP):<\/p>\n<\/li>\n<li>\n<p>Masked Language Modeling (MLM):<\/p>\n<\/li>\n<\/ul>\n<p>\u5728\u8f93\u5165\u53e5\u5b50\u4e2d\u968f\u673a\u9009\u62e9\u4e00\u4e9b\u5355\u8bcd\uff0c\u5c06\u5b83\u4eec\u66ff\u6362\u4e3a\u4e00\u4e2a\u7279\u6b8a\u7684\u6807\u8bb0\uff08\u5982[MASK]\uff09\u3002<\/p>\n<p>\u6a21\u578b\u7684\u4efb\u52a1\u662f\u57fa\u4e8e\u4e0a\u4e0b\u6587\u9884\u6d4b\u88ab\u63a9\u76d6\u7684\u5355\u8bcd\u3002\u4f8b\u5982\uff0c\u5bf9\u4e8e\u53e5\u5b50 &quot;The cat sat on the [MASK]&quot;, \u6a21\u578b\u9700\u8981\u9884\u6d4b[MASK]\u5904\u7684\u5355\u8bcd &quot;mat&quot;\u3002<\/p>\n<p>\u8fd9\u79cd\u65b9\u6cd5\u8ba9\u6a21\u578b\u5728\u9884\u8bad\u7ec3\u9636\u6bb5\u5b66\u4f1a\u5229\u7528\u5de6\u53f3\u4e0a\u4e0b\u6587\u4fe1\u606f\uff0c\u4ece\u800c\u83b7\u5f97\u53cc\u5411\u8868\u793a\u3002<\/p>\n<ul>\n<li>Next Sentence Prediction 
(NSP):<\/li>\n<\/ul>\n<p>\u7ed9\u5b9a\u4e24\u4e2a\u53e5\u5b50A\u548cB\uff0c\u6a21\u578b\u9700\u8981\u9884\u6d4bB\u662f\u5426\u4e3aA\u7684\u4e0b\u4e00\u4e2a\u53e5\u5b50\u3002<\/p>\n<p>\u8bad\u7ec3\u6570\u636e\u753150%\u7684\u6b63\u6837\u672c\uff08\u5373B\u662f\u771f\u6b63\u7684\u4e0b\u4e00\u4e2a\u53e5\u5b50\uff09\u548c50%\u7684\u8d1f\u6837\u672c\uff08\u5373B\u662f\u4ece\u8bed\u6599\u5e93\u4e2d\u968f\u673a\u9009\u62e9\u7684\u53e5\u5b50\uff09\u7ec4\u6210\u3002<\/p>\n<p>\u8fd9\u4e00\u4efb\u52a1\u5e2e\u52a9\u6a21\u578b\u5b66\u4e60\u53e5\u5b50\u7ea7\u522b\u7684\u5173\u7cfb\u548c\u4e0a\u4e0b\u6587\u8fde\u63a5\u3002<\/p>\n<p><strong>GPT\u7684\u8bad\u7ec3\u65b9\u5f0f<\/strong><\/p>\n<ul>\n<li>GPT\u7684\u8bad\u7ec3\u76ee\u6807\u662fCausal Language Modeling (CLM):<\/li>\n<\/ul>\n<p>\u4f7f\u7528\u6807\u51c6\u7684\u81ea\u56de\u5f52\u8bed\u8a00\u6a21\u578b\uff0c\u5373\u7ed9\u5b9a\u524d\u9762\u6240\u6709\u7684\u5355\u8bcd\uff0c\u9884\u6d4b\u4e0b\u4e00\u4e2a\u5355\u8bcd\u3002<\/p>\n<p>\u4f8b\u5982\uff0c\u5bf9\u4e8e\u53e5\u5b50 &quot;The cat sat on the&quot;, \u6a21\u578b\u9700\u8981\u9884\u6d4b\u4e0b\u4e00\u4e2a\u5355\u8bcd 
&quot;mat&quot;\u3002<\/p>\n<p>\u6a21\u578b\u53ea\u5229\u7528\u524d\u5411\u4e0a\u4e0b\u6587\uff08\u5373\u5de6\u4fa7\u7684\u5355\u8bcd\uff09\uff0c\u4e0d\u4f1a\u5229\u7528\u53f3\u4fa7\u7684\u4e0a\u4e0b\u6587\u4fe1\u606f\u3002\u8fd9\u79cd\u8bad\u7ec3\u65b9\u5f0f\u88ab\u79f0\u4e3a\u5355\u5411\u6216\u81ea\u56de\u5f52\u8bad\u7ec3\u3002<\/p>\n<p>\u5c3d\u7ba1BERT\u548cGPT\u7684\u8bad\u7ec3\u65b9\u5f0f\u6709\u6240\u4e0d\u540c\uff0c\u5b83\u4eec\u90fd\u5c5e\u4e8e\u81ea\u76d1\u7763\u5b66\u4e60\u7684\u8303\u7574\uff0c<strong>\u56e0\u4e3a\u5b83\u4eec\u5229\u7528\u672a\u6807\u6ce8\u7684\u5927\u89c4\u6a21\u8bed\u6599\u5e93\uff0c\u901a\u8fc7\u8bbe\u8ba1\u5408\u9002\u7684\u9884\u6d4b\u4efb\u52a1\u6765\u81ea\u4e3b\u751f\u6210\u8bad\u7ec3\u6570\u636e<\/strong>\u3002\u8fd9\u79cd\u65b9\u5f0f\u4e0d\u4ec5\u51cf\u5c11\u4e86\u5bf9\u4eba\u5de5\u6807\u6ce8\u6570\u636e\u7684\u4f9d\u8d56\uff0c\u800c\u4e14\u80fd\u6709\u6548\u5730\u4ece\u5927\u91cf\u65e0\u6807\u6ce8\u6587\u672c\u4e2d\u5b66\u4e60\u8bed\u8a00\u8868\u793a\u3002<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914161926157.png\" style=\"height:400px\">\n<\/p>\n<p><a href=\"https:\/\/arxiv.org\/abs\/1810.04805\">BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding<\/a><\/p>\n<p><a href=\"https:\/\/s3-us-west-2.amazonaws.com\/openai-assets\/research-covers\/language-unsupervised\/language_understanding_paper.pdf\">(GPT-1): Improving Language Understanding by Generative Pre-Training<\/a><\/p>\n<p><a href=\"https:\/\/cdn.openai.com\/better-language-models\/language_models_are_unsupervised_multitask_learners.pdf\">(GPT-2): Language Models are Unsupervised Multitask Learners<\/a><\/p>\n<p><a href=\"https:\/\/arxiv.org\/abs\/2005.14165\">(GPT-3): Language Models are Few-Shot Learners<\/a><\/p>\n<h3><img decoding=\"async\" 
src=\"https:\/\/img.icons8.com\/plasticine\/100\/000000\/protect-from-magnetic-field.png\" style=\"height:50px;display:inline\">  \u5bf9\u6bd4\u5b66\u4e60\uff08Contrastive Learning\uff09<\/h3>\n<hr \/>\n<p>\u5bf9\u6bd4\u5b66\u4e60\uff08Contrastive Learning\uff09\u662f\u81ea\u76d1\u7763\u5b66\u4e60\uff08Self-Supervised Learning\uff09\u7684\u4e00\u79cd\u5b50\u7c7b\u3002\u81ea\u76d1\u7763\u5b66\u4e60\u662f\u901a\u8fc7\u8bbe\u8ba1\u5408\u9002\u7684\u9884\u6d4b\u4efb\u52a1\uff0c\u4ece\u672a\u6807\u6ce8\u7684\u6570\u636e\u4e2d\u751f\u6210\u8bad\u7ec3\u4fe1\u53f7\u6765\u8fdb\u884c\u5b66\u4e60\u7684\u4e00\u79cd\u65b9\u6cd5\u3002<strong>\u800c\u5bf9\u6bd4\u5b66\u4e60\u901a\u8fc7\u6784\u5efa\u6b63\u8d1f\u6837\u672c\u5bf9\uff0c\u6765\u5b66\u4e60\u6570\u636e\u8868\u793a\u7684\u76f8\u4f3c\u6027\u548c\u5dee\u5f02\u6027<\/strong>\u3002<\/p>\n<ul>\n<li>\u5bf9\u6bd4\u5b66\u4e60\u662f\u4e00\u79cd\u4e3a ML \u6a21\u578b\u5236\u5b9a<strong>\u5bfb\u627e\u76f8\u4f3c\u548c\u4e0d\u76f8\u4f3c\u4e8b\u7269\u7684\u4efb\u52a1\u7684\u65b9\u6cd5<\/strong>\u3002<\/li>\n<li>\u987e\u540d\u601d\u4e49\uff0c\u5bf9\u6bd4\u65b9\u6cd5\u901a\u8fc7\u5bf9\u6bd4<strong>\u6b63\u4f8b\u548c\u8d1f\u4f8b<\/strong>\u6765\u5b66\u4e60\u8868\u793a\u3002<\/li>\n<li>\u4f7f\u7528\u8fd9\u79cd\u65b9\u6cd5\uff0c\u53ef\u4ee5\u8bad\u7ec3\u673a\u5668\u5b66\u4e60\u6a21\u578b\u5bf9\u76f8\u4f3c\u548c\u4e0d\u76f8\u4f3c\u7684\u56fe\u50cf\u8fdb\u884c\u5206\u7c7b\u3002<\/li>\n<li>\u66f4\u6b63\u5f0f\u5730\u8bf4\uff0c\u5bf9\u4e8e\u4efb\u4f55\u6570\u636e\u70b9 $x$\uff0c\u5bf9\u6bd4\u65b9\u6cd5\u65e8\u5728\u5b66\u4e60\u7f16\u7801\u5668 $f$\uff0c\u4f7f\u5f97\uff1a<\/li>\n<\/ul>\n<p>$$\\operatorname{score}(f(x), f(x^+)) \\gg \\operatorname{score}(f(x), f(x^-))$$<\/p>\n<ul>\n<li>\u4e0a\u5f0f\u4e2d\u7684 $\\operatorname{score}$ \u88ab\u79f0\u4e3a<strong>\u5f97\u5206\u51fd\u6570<\/strong>\uff0c\u662f\u8861\u91cf\u4e24\u4e2a\u7279\u5f81\u76f8\u4f3c\u5ea6\u7684\u6307\u6807\u3002<\/li>\n<li>$x^+$ \u662f\u4e0e $x$ \u76f8\u4f3c\u7684\u6570\u636e\u70b9\uff0c\u79f0\u4e3a <em>\u6b63\u4f8b<\/em> 
\u6837\u672c\u3002<\/li>\n<li>$x^-$ \u662f\u4e0e $x$ \u4e0d\u76f8\u4f3c\u7684\u6570\u636e\u70b9\uff0c\u79f0\u4e3a <em>\u8d1f\u4f8b<\/em> \u6837\u672c\u3002<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914162032506.png\" style=\"height:200px\">\n<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914162100947.gif\" style=\"height:200px\">\n<\/p>\n<ul>\n<li>\n<p><a href=\"https:\/\/analyticsindiamag.com\/contrastive-learning-self-supervised-ml\">Image Source<\/a><\/p>\n<\/li>\n<li>\n<p>\u6700\u5e38\u89c1\u635f\u5931\u51fd\u6570\u662f<strong>InfoNCE<\/strong>\u635f\u5931\uff0c\u5b83\u770b\u8d77\u6765\u4e0esoftmax\u7c7b\u4f3c\u3002<\/p>\n<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914162145142.png\" style=\"height:100px\">\n<\/p>\n<ul>\n<li>\n<p>$\\mathcal{L}_N$: InfoNCE\u635f\u5931\u51fd\u6570<\/p>\n<\/li>\n<li>\n<p>$\\mathbb{E}_X$: \u5bf9\u6240\u6709\u6837\u672c$x$\u7684\u671f\u671b<\/p>\n<\/li>\n<li>\n<p>$f(x)$: \u8f93\u5165\u6837\u672c$x$\u901a\u8fc7\u795e\u7ecf\u7f51\u7edc\u5f97\u5230\u7684\u8868\u793a\uff08embedding\uff09<\/p>\n<\/li>\n<li>\n<p>$f(x^+)$: \u8f93\u5165\u6837\u672c$x$\u7684\u6b63\u6837\u672c\uff08positive sample\uff09\u7684\u8868\u793a\uff08embedding\uff09\uff0c\u5373\u4e0e$x$\u76f8\u4f3c\u6216\u76f8\u5173\u7684\u6837\u672c<\/p>\n<\/li>\n<li>\n<p>$f(x_j)$: \u8f93\u5165\u6837\u672c$x$\u7684\u8d1f\u6837\u672c\uff08negative samples\uff09\u7684\u8868\u793a\uff08embedding\uff09\uff0c\u5373\u4e0e$x$\u4e0d\u76f8\u4f3c\u6216\u65e0\u5173\u7684\u6837\u672c\uff0c\u5176\u4e2d$j$\u4e3a\u8d1f\u6837\u672c\u7684\u7d22\u5f15<\/p>\n<\/li>\n<li>\n<p>$N$: 
\u6837\u672c\u5bf9\u7684\u603b\u6570\uff0c\u5305\u62ec1\u4e2a\u6b63\u6837\u672c\u548c$N-1$\u4e2a\u8d1f\u6837\u672c<\/p>\n<\/li>\n<li>\n<p>\u5206\u5b50\u90e8\u5206\u8868\u793a\u6837\u672c$x$\u4e0e\u5176\u6b63\u6837\u672c$x^+$\u7684\u8868\u793a\u5411\u91cf\u7684\u70b9\u79ef\u7684\u6307\u6570\u3002\u70b9\u79ef\u8d8a\u5927\uff0c\u6307\u6570\u503c\u8d8a\u5927\uff0c\u8868\u793a$x$\u4e0e$x^+$\u7684\u76f8\u4f3c\u5ea6\u8d8a\u9ad8\u3002<\/p>\n<\/li>\n<li>\n<p>\u5206\u6bcd\u90e8\u5206\u5305\u62ec\u4e86\u6837\u672c$x$\u4e0e\u6b63\u6837\u672c$x^+$\u7684\u76f8\u4f3c\u5ea6\u6307\u6570\uff0c\u4ee5\u53ca\u6837\u672c$x$\u4e0e\u6240\u6709\u8d1f\u6837\u672c$x_j$\u7684\u76f8\u4f3c\u5ea6\u6307\u6570\u7684\u548c\u3002\u76ee\u7684\u662f\u5c06\u6b63\u6837\u672c\u4e0e\u8d1f\u6837\u672c\u8fdb\u884c\u5bf9\u6bd4\u3002<\/p>\n<\/li>\n<li>\n<p><a href=\"https:\/\/github.com\/RElbers\/info-nce-pytorch\">InfoNCE Loss in PyTorch<\/a><\/p>\n<\/li>\n<\/ul>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/pastel-glyph\/64\/000000\/qr-code--v2.png\" style=\"height:50px;display:inline\">  Contrastive Predictive Coding (CPC)<\/h3>\n<hr \/>\n<ul>\n<li><a href=\"https:\/\/arxiv.org\/abs\/1807.03748\"><strong>\u5bf9\u6bd4\u9884\u6d4b\u7f16\u7801 (CPC)<\/strong><\/a> \u901a\u8fc7\u4f7f\u7528\u5f3a\u5927\u7684\u81ea\u56de\u5f52\u6a21\u578b\u5728\u5b66\u4e60\u5230\u7684 <em>\u6f5c\u5728\u7a7a\u95f4<\/em> \u4e2d<strong>\u9884\u6d4b\u672a\u6765<\/strong>\uff0c\u4ece\u800c\u5b66\u4e60\u81ea\u76d1\u7763\u8868\u793a\u3002<\/li>\n<li>\u8be5\u6a21\u578b\u4f7f\u7528\u6982\u7387\u5bf9\u6bd4\u635f\u5931\uff0c\u4ece\u800c\u8bf1\u5bfc\u6f5c\u5728\u7a7a\u95f4\u6355\u83b7<strong>\u5bf9\u9884\u6d4b\u672a\u6765\u6837\u672c\u6700\u6709\u7528<\/strong>\u7684\u4fe1\u606f\u3002<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914162336541.png\" 
style=\"height:300px\">\n<\/p>\n<p>\u56fe\u793a\u5c55\u793a\u4e86CPC\u7684\u67b6\u6784\uff0c\u901a\u8fc7\u4e00\u4e2a\u5e8f\u5217\u793a\u4f8b\uff08\u5982\u97f3\u9891\u4fe1\u53f7\uff09\u8fdb\u884c\u89e3\u91ca\u3002\u5c3d\u7ba1\u56fe\u4e2d\u4f7f\u7528\u7684\u662f\u97f3\u9891\u8f93\u5165\uff0c\u5b9e\u9645\u4e0a\u540c\u6837\u7684\u8bbe\u7f6e\u53ef\u4ee5\u7528\u4e8e\u56fe\u50cf\u3001\u6587\u672c\u548c\u5f3a\u5316\u5b66\u4e60\u7b49\u3002<\/p>\n<ol>\n<li>\u8f93\u5165\u5e8f\u5217\uff1a<\/li>\n<\/ol>\n<p>\u56fe\u793a\u5e95\u90e8\u7684\u6ce2\u5f62\u8868\u793a\u4e00\u4e2a\u97f3\u9891\u8f93\u5165\u5e8f\u5217\uff0c\u8f93\u5165\u5e8f\u5217\u4e3a${x_{t-3}, x_{t-2}, x_{t-1}, x_t, x_{t+1}, x_{t+2}, x_{t+3}, x_{t+4}}$\u3002<\/p>\n<ol start=\"2\">\n<li>\u7f16\u7801\u5668\uff08$g_{enc}$\uff09\uff1a<\/li>\n<\/ol>\n<p>\u6bcf\u4e2a\u8f93\u5165$x_t$\u901a\u8fc7\u7f16\u7801\u5668$g_{enc}$\u8f6c\u5316\u4e3a\u6f5c\u5728\u8868\u793a$z_t$\u3002\u7f16\u7801\u5668\u5c06\u9ad8\u7ef4\u8f93\u5165\u6570\u636e\u538b\u7f29\u5230\u4f4e\u7ef4\u7684\u6f5c\u5728\u7a7a\u95f4\u3002<\/p>\n<ol start=\"3\">\n<li>\u81ea\u56de\u5f52\u6a21\u578b\uff08$g_{ar}$\uff09\uff1a<\/li>\n<\/ol>\n<p>\u81ea\u56de\u5f52\u6a21\u578b$g_{ar}$\u5bf9\u8fc7\u53bb\u7684\u6f5c\u5728\u8868\u793a\u8fdb\u884c\u5efa\u6a21\uff0c\u751f\u6210\u4e0a\u4e0b\u6587\u8868\u793a$c_t$\u3002\u4e0a\u4e0b\u6587\u8868\u793a$c_t$\u603b\u7ed3\u4e86\u5f53\u524d\u65f6\u95f4\u6b65\u4e4b\u524d\u7684\u6240\u6709\u4fe1\u606f\uff0c\u7528\u4e8e\u9884\u6d4b\u672a\u6765\u3002\u81ea\u56de\u5f52\u6a21\u578b\uff08$g_{ar}$\uff09\u901a\u5e38\u662f\u5faa\u73af\u795e\u7ecf\u7f51\u7edc\uff08RNN\uff0cRecurrent Neural Network\uff09\uff0c\u5982\u957f\u77ed\u671f\u8bb0\u5fc6\u7f51\u7edc\uff08LSTM\uff0cLong Short-Term Memory\uff09\u6216\u95e8\u63a7\u5faa\u73af\u5355\u5143\uff08GRU\uff0cGated Recurrent Unit\uff09\u3002\u8fd9\u4e9b\u6a21\u578b\u53ef\u4ee5\u6709\u6548\u5730\u6355\u6349\u65f6\u95f4\u5e8f\u5217\u6570\u636e\u4e2d\u7684\u4f9d\u8d56\u5173\u7cfb\u3002<\/p>\n<ol 
start=\"4\">\n<li>\u9884\u6d4b\u672a\u6765\uff1a<\/li>\n<\/ol>\n<p>\u4f7f\u7528\u4e0a\u4e0b\u6587\u8868\u793a$c_t$\u9884\u6d4b\u672a\u6765\u7684\u6f5c\u5728\u8868\u793a$z_{t+k}$\u3002\u56fe\u793a\u4e2d\u5c55\u793a\u4e86\u5bf9\u672a\u6765\u591a\u4e2a\u65f6\u95f4\u6b65\u7684\u9884\u6d4b\u3002\u8fd9\u4e2a\u9884\u6d4b\u7f51\u7edc\u901a\u5e38\u662f\u4e00\u4e2a\u591a\u5c42\u611f\u77e5\u5668\uff08MLP, Multi-Layer Perceptron\uff09\uff0c\u4f46\u4e5f\u53ef\u4ee5\u4f7f\u7528\u5176\u4ed6\u6a21\u578b\u7ed3\u6784\uff0c\u4f8b\u5982\u5377\u79ef\u795e\u7ecf\u7f51\u7edc\uff08CNN\uff09\u6216\u5faa\u73af\u795e\u7ecf\u7f51\u7edc\uff08RNN\uff09\u3002<\/p>\n<ol start=\"5\">\n<li>\u5bf9\u6bd4\u635f\u5931\uff1a<\/li>\n<\/ol>\n<p>CPC\u901a\u8fc7\u5bf9\u6bd4\u635f\u5931\u51fd\u6570\u6765\u8fdb\u884c\u8bad\u7ec3\uff0c\u8be5\u635f\u5931\u51fd\u6570\u4f7f\u5f97\u6b63\u786e\u7684\u672a\u6765\u9884\u6d4b\u5728\u8868\u793a\u7a7a\u95f4\u4e2d\u66f4\u63a5\u8fd1\uff0c\u800c\u4e0d\u76f8\u5173\u7684\u9884\u6d4b\u66f4\u8fdc\u79bb\u3002\u5177\u4f53\u6765\u8bf4\uff0c\u6b63\u4f8b\u53ef\u4ee5\u662f\u672a\u6765\u65f6\u95f4\u6b65 $t+1$ \u548c $t+2$ \u7684\u8868\u793a\uff0c\u5373 $f\\left(x_4\\right)$ \u548c $f\\left(x_5\\right)$ \u3002\u8d1f\u4f8b\u53ef\u4ee5\u662f\u4e0e\u5f53\u524d\u65f6\u95f4\u6b65\u65e0\u5173\u7684\u5176\u4ed6\u65f6\u95f4\u6b65\u7684\u8868\u793a\uff0c\u5982 $f\\left(x_1\\right), f\\left(x_2\\right)$\uff0c\u6216\u8005\u4ece\u5176\u4ed6\u5e8f\u5217\u4e2d\u968f\u673a\u9009\u53d6\u7684\u8868\u793a\u3002<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/jefflai108\/Contrastive-Predictive-Coding-PyTorch\">PyTorch \u4ee3\u7801<\/a><\/li>\n<\/ul>\n<p><strong>\u8bad\u7ec3\u540e\u7684\u5e94\u7528<\/strong><br 
\/>\n\u8bad\u7ec3\u5f97\u5230\u7684\u8fd9\u4e9b\u6a21\u578b\u7ec4\u4ef6\u53ef\u4ee5\u7528\u4e8e\u4ee5\u4e0b\u51e0\u4e2a\u65b9\u9762\uff1a<\/p>\n<p><strong>\u7279\u5f81\u63d0\u53d6\uff1a<\/strong><\/p>\n<p>\u7f16\u7801\u5668\uff1a\u53ef\u4ee5\u7528\u4f5c\u7279\u5f81\u63d0\u53d6\u5668\uff0c\u5c06\u8f93\u5165\u6570\u636e\u6620\u5c04\u5230\u6f5c\u5728\u7a7a\u95f4\u3002\u8fd9\u4e9b\u6f5c\u5728\u8868\u793a\u53ef\u4ee5\u4f5c\u4e3a\u8f93\u5165\u7279\u5f81\u7528\u4e8e\u5176\u4ed6\u673a\u5668\u5b66\u4e60\u6a21\u578b\uff0c\u5982\u5206\u7c7b\u5668\u3001\u56de\u5f52\u5668\u7b49\u3002\u4f8b\u5982\uff0c\u5728\u56fe\u50cf\u5206\u7c7b\u4efb\u52a1\u4e2d\uff0c\u53ef\u4ee5\u5c06\u56fe\u50cf\u901a\u8fc7\u7f16\u7801\u5668\u751f\u6210\u7279\u5f81\u8868\u793a\uff0c\u7136\u540e\u7528\u8fd9\u4e9b\u7279\u5f81\u8fdb\u884c\u5206\u7c7b\u3002<\/p>\n<p><strong>\u4e0b\u6e38\u4efb\u52a1\uff1a<\/strong><\/p>\n<p>\u7f16\u7801\u5668\u548c\u81ea\u56de\u5f52\u6a21\u578b\uff1a\u53ef\u4ee5\u7528\u4e8e\u65f6\u95f4\u5e8f\u5217\u9884\u6d4b\u3001\u751f\u6210\u4efb\u52a1\u4ee5\u53ca\u5f02\u5e38\u68c0\u6d4b\u3002\u4f8b\u5982\uff0c\u5728\u91d1\u878d\u65f6\u95f4\u5e8f\u5217\u9884\u6d4b\u4e2d\uff0c\u53ef\u4ee5\u4f7f\u7528\u7f16\u7801\u5668\u751f\u6210\u7684\u7279\u5f81\u8868\u793a\u548c\u81ea\u56de\u5f52\u6a21\u578b\u751f\u6210\u7684\u4e0a\u4e0b\u6587\u8868\u793a\u8fdb\u884c\u672a\u6765\u4ef7\u683c\u9884\u6d4b\u3002<\/p>\n<p>\u7f16\u7801\u5668\u548c\u9884\u6d4b\u6a21\u578b\uff1a\u53ef\u4ee5\u7528\u4e8e\u751f\u6210\u6a21\u578b\u548c\u589e\u5f3a\u5b66\u4e60\u4efb\u52a1\uff0c\u4f8b\u5982\u5728\u81ea\u7136\u8bed\u8a00\u5904\u7406\u4efb\u52a1\u4e2d\uff0c\u53ef\u4ee5\u751f\u6210\u4e0b\u4e00\u53e5\u6216\u4e0b\u4e00\u6bb5\u6587\u672c\u3002<\/p>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/nolan\/64\/collapse-arrow.png\" style=\"height:50px;display:inline\">  Simple Framework for Contrastive Learning of Visual Representations (SimCLR)<\/h3>\n<hr \/>\n<ul>\n<li><a 
href=\"https:\/\/arxiv.org\/abs\/2002.05709\"><strong>Simple Framework for Contrastive Learning of Visual Representations (SimCLR)<\/strong><\/a> is a framework for contrastive learning of <em>visual<\/em> representations. <\/li>\n<li>It learns representations by maximizing agreement between differently augmented views of the same data example via a contrastive loss in the latent space.<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914163010325.png\" style=\"height:300px\">\n<\/p>\n<ol>\n<li>\u6570\u636e\u589e\u5f3a:\n<ul>\n<li>\u5bf9\u539f\u59cb\u56fe\u50cf $x$ \u8fdb\u884c\u4e24\u6b21\u4e0d\u540c\u7684\u968f\u673a\u6570\u636e\u589e\u5f3a\uff0c\u751f\u6210\u4e24\u4e2a\u589e\u5f3a\u89c6\u56fe $\\tilde{x}_i$ \u548c $\\tilde{x}_j$\u3002<\/li>\n<li>\u589e\u5f3a\u64cd\u4f5c\u7531\u589e\u5f3a\u51fd\u6570\u96c6\u5408 $\\mathcal{T}$ \u751f\u6210\u3002<\/li>\n<\/ul>\n<\/li>\n<li>\u7f16\u7801\u5668:\n<ul>\n<li>\u7ecf\u8fc7\u589e\u5f3a\u540e\u7684\u56fe\u50cf $\\tilde{x}_i$  \u548c $\\tilde{x}_j$ \u5206\u522b\u901a\u8fc7\u7f16\u7801\u5668 $f(\\cdot)$ \u751f\u6210\u8868\u793a  $h_i$ \u548c $h_j$ \u3002<\/li>\n<\/ul>\n<\/li>\n<li>\u6295\u5f71\u5934:\n<ul>\n<li>\u8868\u793a $h_i$ \u548c $h_j$ \u8fdb\u4e00\u6b65\u901a\u8fc7\u6295\u5f71\u5934 $g(\\cdot)$ \u751f\u6210\u6f5c\u5728\u7a7a\u95f4\u4e2d\u7684\u8868\u793a $z_i$ \u548c  $z_j$  \u3002<\/li>\n<\/ul>\n<\/li>\n<li>\u5bf9\u6bd4\u5b66\u4e60:\n<ul>\n<li>\u901a\u8fc7\u5bf9\u6bd4\u635f\u5931\uff08\u5982InfoNCE\uff09\u6700\u5927\u5316 $z_i$ \u548c  $z_j$  \u4e4b\u95f4\u7684\u4e00\u81f4\u6027\uff0c\u6700\u5c0f\u5316$z_i$ 
\u4e0e\u5176\u4ed6\u6837\u672c\u8868\u793a\u4e4b\u95f4\u7684\u76f8\u4f3c\u5ea6\u3002<\/li>\n<li>\u635f\u5931\u51fd\u6570\u7684\u76ee\u6807\u662f\u4f7f\u76f8\u540c\u6570\u636e\u793a\u4f8b\u7684\u589e\u5f3a\u89c6\u56fe\u5728\u6f5c\u5728\u7a7a\u95f4\u4e2d\u7684\u8868\u793a\u66f4\u63a5\u8fd1\uff0c\u800c\u4e0d\u540c\u6570\u636e\u793a\u4f8b\u7684\u8868\u793a\u5219\u66f4\u8fdc\u79bb\u3002<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914163649159.gif\" style=\"height:350px\">\n<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/sthalles\/SimCLR\">PyTorch Code<\/a><\/li>\n<li><a href=\"https:\/\/colab.research.google.com\/github\/rll\/deepul\/blob\/master\/demos\/lecture7_selfsupervised_demos.ipynb#scrollTo=YB_cqJagEXbw\">Colab Example<\/a><\/li>\n<\/ul>\n<h3><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/external-gradients-pongsakorn-tan\/64\/null\/external-clip-gdpr-gradients-pongsakorn-tan.png\" style=\"height:50px;display:inline\">  CLIP - Contrastive Language\u2013Image Pre-training<\/h3>\n<hr \/>\n<p>CLIP\u662f\u4e00\u79cd\u7531OpenAI\u5f00\u53d1\u7684\u795e\u7ecf\u7f51\u7edc\u6a21\u578b\uff0c\u65e8\u5728\u901a\u8fc7\u81ea\u7136\u8bed\u8a00\u76d1\u7763\u9ad8\u6548\u5730\u5b66\u4e60\u89c6\u89c9\u6982\u5ff5\u3002CLIP\u80fd\u591f\u5728\u6ca1\u6709\u4e13\u95e8\u8bad\u7ec3\u7684\u60c5\u51b5\u4e0b\uff0c\u7406\u89e3\u5e76\u5206\u7c7b\u56fe\u50cf\u4e2d\u7684\u5185\u5bb9\u3002<\/p>\n<p>CLIP\u7684\u5de5\u4f5c\u8fc7\u7a0b\u53ef\u4ee5\u5206\u4e3a\u4e09\u4e2a\u4e3b\u8981\u6b65\u9aa4\uff1a<\/p>\n<p><strong>1. 
\u5bf9\u6bd4\u9884\u8bad\u7ec3\uff08Contrastive Pre-training\uff09<\/strong><\/p>\n<ul>\n<li>\u6570\u636e\u6536\u96c6\uff1aCLIP\u4f7f\u7528\u6765\u81ea\u4e92\u8054\u7f51\u7684\u5927\u89c4\u6a21\u6570\u636e\u96c6\uff0c\u8fd9\u4e9b\u6570\u636e\u96c6\u5305\u542b\u4e86\u56fe\u50cf\u548c\u4e0e\u4e4b\u914d\u5bf9\u7684\u6587\u672c\u63cf\u8ff0\u3002\u4f8b\u5982\uff0c\u4e00\u5f20\u72d7\u7684\u7167\u7247\u53ef\u80fd\u4f1a\u914d\u6709\u201c\u8fd9\u662f\u4e00\u53ea\u72d7\u201d\u7684\u6587\u5b57\u8bf4\u660e\u3002<\/li>\n<li>\u56fe\u50cf\u7f16\u7801\u5668\u548c\u6587\u672c\u7f16\u7801\u5668\uff1a\u5728\u9884\u8bad\u7ec3\u9636\u6bb5\uff0cCLIP\u4f1a\u540c\u65f6\u8bad\u7ec3\u4e00\u4e2a\u56fe\u50cf\u7f16\u7801\u5668\u548c\u4e00\u4e2a\u6587\u672c\u7f16\u7801\u5668\u3002\u56fe\u50cf\u7f16\u7801\u5668\u5c06\u56fe\u50cf\u8f6c\u6362\u4e3a\u4e00\u7ec4\u7279\u5f81\u5411\u91cf\uff0c\u6587\u672c\u7f16\u7801\u5668\u5c06\u6587\u672c\u8f6c\u6362\u4e3a\u53e6\u4e00\u7ec4\u7279\u5f81\u5411\u91cf\u3002<\/li>\n<li>\u5bf9\u6bd4\u5b66\u4e60\uff1a\u901a\u8fc7\u5bf9\u6bd4\u5b66\u4e60\uff0cCLIP\u4f1a\u5b66\u4e60\u5230\u54ea\u4e9b\u56fe\u50cf\u548c\u6587\u672c\u662f\u6b63\u786e\u914d\u5bf9\u7684\u3002\u7b80\u5355\u6765\u8bf4\uff0c\u6a21\u578b\u4f1a\u5c3d\u91cf\u4f7f\u6b63\u786e\u914d\u5bf9\u7684\u56fe\u50cf\u548c\u6587\u672c\u7684\u7279\u5f81\u5411\u91cf\u76f8\u4f3c\u5ea6\u66f4\u9ad8\uff0c\u800c\u9519\u8bef\u914d\u5bf9\u7684\u76f8\u4f3c\u5ea6\u66f4\u4f4e\u3002<\/li>\n<\/ul>\n<p><strong>2. 
Create Dataset Classifier from Label Text<\/strong><\/p>\n<ul>\n<li>Text description conversion: after pre-training, each class name (label) is turned into a natural-language description using a prompt template. For example, the label \u201cdog\u201d becomes \u201ca photo of a dog\u201d.<\/li>\n<li>Text encoding: the text encoder converts these descriptions into feature vectors, which are then compared with the image feature vectors output by the image encoder.<\/li>\n<\/ul>\n<p><strong>3. Use for Zero-shot Prediction<\/strong><\/p>\n<ul>\n<li>\n<p>Zero-shot classification: at inference time, CLIP can classify images without any task-specific training. Even if the model was never trained on labeled examples of a particular image class, given a text description (for example, \u201ca photo of a cat\u201d) it can predict whether an image matches that description.<\/p>\n<\/li>\n<li>\n<p>Matching prediction: for each image, CLIP computes the cosine similarity between the image and every text description and selects the description with the highest similarity as the prediction. For example, given a new animal photo, CLIP can decide whether it is closer to \u201ca photo of a dog\u201d or \u201ca photo of a cat\u201d.<\/p>\n<\/li>\n<li>\n<p><a href=\"https:\/\/github.com\/openai\/CLIP\">Official Repository (PyTorch)<\/a><\/p>\n<\/li>\n<li>\n<p><a href=\"https:\/\/colab.research.google.com\/github\/openai\/CLIP\/blob\/main\/notebooks\/Interacting_with_CLIP.ipynb\">Colab Example - Interaction with CLIP and Zero-Shot Classification<\/a><\/p>\n<\/li>\n<li>\n<p>HuggingFace Demos:<\/p>\n<ul>\n<li><a href=\"https:\/\/huggingface.co\/openai\/clip-vit-large-patch14\">CLIP-ViT-Large<\/a><\/li>\n<li><a href=\"https:\/\/huggingface.co\/spaces\/taesiri\/CLIPScore\">CLIPScore<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914163755519.png\" style=\"height:600px\">\n<\/p>\n<p align=\"center\">\n  <img decoding=\"async\" src=\"https:\/\/gnnclub-1311496010.cos.ap-beijing.myqcloud.com\/wp-content\/uploads\/2024\/09\/20240914163948530.png\" style=\"height:250px\">\n<\/p>\n<h4>CLIP Extensions<\/h4>\n<hr \/>\n<ul>\n<li><a href=\"https:\/\/github.com\/yzhuoning\/Awesome-CLIP\">Awesome-CLIP<\/a> - a repository that collects CLIP-based applications<\/li>\n<li><a href=\"https:\/\/arxiv.org\/abs\/2107.07651\">ALBEF - Align Before Fuse<\/a>\n<ul>\n<li><a href=\"https:\/\/github.com\/salesforce\/ALBEF\/\">PyTorch Code<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/arxiv.org\/abs\/2204.05991\">ReCLIP - A Strong Zero-Shot Baseline for Referring Expression Comprehension<\/a>\n<ul>\n<li><a href=\"https:\/\/github.com\/allenai\/reclip\">PyTorch Code<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/arxiv.org\/abs\/2303.15343\">SigLIP - Sigmoid Loss for Language Image Pre-Training<\/a>\n<ul>\n<li><a href=\"https:\/\/huggingface.co\/docs\/transformers\/main\/en\/model_doc\/siglip\">Model on HF<\/a><\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/github.com\/mlfoundations\/open_clip\">OpenCLIP - an open source 
implementation of CLIP<\/a><\/li>\n<\/ul>\n<h2><img decoding=\"async\" src=\"https:\/\/img.icons8.com\/stickers\/100\/null\/prize.png\" style=\"height:50px;display:inline\"> Credits<\/h2>\n<hr \/>\n<ul>\n<li>Icons made by <a href=\"https:\/\/www.flaticon.com\/authors\/becris\" title=\"Becris\">Becris<\/a> from <a href=\"https:\/\/www.flaticon.com\/\" title=\"Flaticon\">www.flaticon.com<\/a><\/li>\n<li>Icons from <a href=\"https:\/\/icons8.com\/\">Icons8.com<\/a> - <a href=\"https:\/\/icons8.com\">https:\/\/icons8.com<\/a><\/li>\n<li><a href=\"https:\/\/ruder.io\/transfer-learning\/\">Sebastian Ruder - Transfer Learning - Machine Learning's Next Frontier<\/a><\/li>\n<li><a href=\"https:\/\/ai.googleblog.com\/2018\/11\/open-sourcing-bert-state-of-art-pre.html\">Jacob Devlin and Ming-Wei Chang - Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing<\/a><\/li>\n<li><a href=\"https:\/\/sites.google.com\/view\/berkeley-cs294-158-sp20\/home\">CS294-158-SP20-Deep Unsupervised Learning<\/a><\/li>\n<li><a href=\"https:\/\/paperswithcode.com\/method\/contrastive-predictive-coding\">Contrastive Predictive Coding<\/a><\/li>\n<li><a href=\"https:\/\/paperswithcode.com\/method\/simclr\"> Simple Framework for Contrastive Learning of Visual Representations (SimCLR)<\/a><\/li>\n<li><a href=\"https:\/\/leimao.github.io\/blog\/Exponential-Moving-Average\/\">Exponential Moving Average<\/a><\/li>\n<li><a href=\"https:\/\/paperswithcode.com\/method\/moco\">Momentum Contrast<\/a><\/li>\n<li><a href=\"https:\/\/untitled-ai.github.io\/understanding-self-supervised-contrastive-learning.html\">Understanding self-supervised and contrastive learning with &quot;Bootstrap Your Own Latent&quot; (BYOL)<\/a><\/li>\n<li><a href=\"https:\/\/creatis-myriad.github.io\/2022\/06\/01\/EmergingPropertiesSSViT.html\">MYRIAD - Emerging Properties in Self-Supervised Vision Transformers<\/a><\/li>\n<li><a href=\"https:\/\/openai.com\/blog\/clip\/\">OpenAI - CLIP: Connecting Text and 
Images<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Deep Learning create by Arwin Yu Tutorial 04 &#8211; Encoder- [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1933,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[18,24],"tags":[19],"class_list":["post-1912","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-18","category-24","tag-19"],"_links":{"self":[{"href":"http:\/\/www.gnn.club\/index.php?rest_route=\/wp\/v2\/posts\/1912","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.gnn.club\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.gnn.club\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.gnn.club\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.gnn.club\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1912"}],"version-history":[{"count":17,"href":"http:\/\/www.gnn.club\/index.php?rest_route=\/wp\/v2\/posts\/1912\/revisions"}],"predecessor-version":[{"id":1975,"href":"http:\/\/www.gnn.club\/index.php?rest_route=\/wp\/v2\/posts\/1912\/revisions\/1975"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/www.gnn.club\/index.php?rest_route=\/wp\/v2\/media\/1933"}],"wp:attachment":[{"href":"http:\/\/www.gnn.club\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1912"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.gnn.club\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1912"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.gnn.club\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1912"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}