Positional encoding in Transformer

Criteria

Ideally, a positional encoding should satisfy the following criteria:
  • It should output a unique encoding for each time-step
  • The distance between any two time-steps should be consistent across sentences of different lengths
  • Our model should generalize to longer sentences without extra effort, so the encoding values should be bounded
  • It must be deterministic
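
To make the criteria concrete, the sketch below tests two naive encodings against them. Neither is part of the method described in this document, and `index_encoding`, `normalized_encoding`, and `adjacent_step` are hypothetical names introduced only for this illustration. The raw index is deterministic and unique, but its values grow without bound. The normalized index is bounded, but the distance between neighbouring positions changes with the sentence length.

```python
def index_encoding(seq_len):
    """Hypothetical baseline: use the raw position index as the encoding."""
    return [float(k) for k in range(seq_len)]


def normalized_encoding(seq_len):
    """Hypothetical baseline: scale positions into [0, 1] (assumes seq_len > 1)."""
    return [k / (seq_len - 1) for k in range(seq_len)]


def adjacent_step(encoding):
    """Distance between the encodings of position 0 and position 1."""
    return encoding[1] - encoding[0]


for seq_len in (5, 11):
    raw = index_encoding(seq_len)
    norm = normalized_encoding(seq_len)
    # The largest raw value grows with seq_len (unbounded), while the
    # normalized step shrinks as seq_len grows (inconsistent distance).
    print(seq_len, max(raw), adjacent_step(norm))
```

Under these assumptions, each baseline violates at least one criterion, which is the motivation for looking at a bounded encoding whose step between positions does not depend on the sentence length.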

Proposed method

\[ P(k, 2i) = \sin\left(\frac{k}{n^{2i/d}}\right) \]
\[ P(k, 2i + 1) = \cos\left(\frac{k}{n^{2i/d}}\right) \]

  • k: position of an object in the input sequence, \(0 \leq k < L\), where L is the sequence length
  • d: dimension of the output embedding space
  • P(k, j): position function that maps position k in the input sequence to entry (k, j) of the positional matrix
  • n: user-defined scalar, set to 10000 by the authors of Attention Is All You Need
  • i: index used for mapping to column indices, \(0 \leq i < d/2\); a single value of i maps to both the sine and the cosine function
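
As a worked instance of the two formulas, assume d = 4 and n = 100 (small values chosen for readability, the same ones used in the matrix example in this document) and take position k = 1. Then i takes the values 0 and 1, and each value of i fills one sine column and one cosine column:

\[ P(1, 0) = \sin\left(\frac{1}{100^{0/4}}\right) = \sin(1) \approx 0.8415 \]
\[ P(1, 1) = \cos\left(\frac{1}{100^{0/4}}\right) = \cos(1) \approx 0.5403 \]
\[ P(1, 2) = \sin\left(\frac{1}{100^{2/4}}\right) = \sin(0.1) \approx 0.0998 \]
\[ P(1, 3) = \cos\left(\frac{1}{100^{2/4}}\right) = \cos(0.1) \approx 0.9950 \]

The denominator grows with i, so later column pairs oscillate more slowly as k increases.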

Example

Figure 1: Positional encoding

Positional encoding matrix

import sys
import torch
import numpy as np
import matplotlib.pyplot as plt

def getPositionEncoding(seq_len, d, n=10000):
    # P[k, j] holds the encoding of position k in column j (d is assumed to be even)
    P = np.zeros((seq_len, d))
    for k in range(seq_len):
        for i in np.arange(int(d/2)):
            # Each i fills one sine column (2i) and one cosine column (2i+1)
            denominator = np.power(n, 2*i/d)
            P[k, 2*i] = np.sin(k/denominator)
            P[k, 2*i+1] = np.cos(k/denominator)
    return P

P = getPositionEncoding(seq_len=4, d=4, n=100)
print(P)

[[ 0.          1.          0.          1.        ]
 [ 0.84147098  0.54030231  0.09983342  0.99500417]
 [ 0.90929743 -0.41614684  0.19866933  0.98006658]
 [ 0.14112001 -0.9899925   0.29552021  0.95533649]]
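
The nested loops can also be written with NumPy broadcasting. The following is a minimal sketch, not the author's code: `position_encoding_vectorized` is a hypothetical name, and it assumes d is even. It also checks three of the criteria listed earlier on a small case: values stay within [-1, 1], every position gets a distinct row, and the encoding of a position does not change when the sequence gets longer.

```python
import numpy as np


def position_encoding_vectorized(seq_len, d, n=10000):
    """Sinusoidal positional encoding via broadcasting (assumes d is even)."""
    k = np.arange(seq_len)[:, np.newaxis]   # positions, shape (seq_len, 1)
    i = np.arange(d // 2)[np.newaxis, :]    # column-pair index, shape (1, d/2)
    angles = k / np.power(n, 2 * i / d)     # shape (seq_len, d/2)
    P = np.zeros((seq_len, d))
    P[:, 0::2] = np.sin(angles)             # even columns 2i
    P[:, 1::2] = np.cos(angles)             # odd columns 2i + 1
    return P


P_short = position_encoding_vectorized(seq_len=4, d=4, n=100)
P_long = position_encoding_vectorized(seq_len=50, d=4, n=100)

print(np.round(P_short, 4))
# Bounded: sine and cosine never leave [-1, 1].
print(np.abs(P_long).max() <= 1.0)
# Consistent across lengths: the first 4 rows do not depend on seq_len.
print(np.allclose(P_short, P_long[:4]))
# Unique: all 50 positions map to distinct rows in this small case.
print(len(np.unique(np.round(P_long, 8), axis=0)) == 50)
```

The uniqueness check covers only this example (50 positions with d = 4 and n = 100), so it is an illustration and not a proof for arbitrary settings.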

Sine wave for different positions with n = 10000 and d = 512

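import os

def plotSinusoid(k, d=512, n=10000):
    # x is the column-pair index i; plot sin(k / n^(2i/d)) for a fixed position k
    x = np.arange(0, 100, 1)
    denominator = np.power(n, 2*x/d)
    y = np.sin(k/denominator)
    plt.plot(x, y)
    plt.title("k=" + str(k))

fig = plt.figure(figsize=(15, 4))
# One subplot per position: k = 0, 4, 8, 12
for i in range(4):
    plt.subplot(1, 4, i + 1)
    plotSinusoid(i*4)
fname = 'images/sinusoid.png'
os.makedirs(os.path.dirname(fname), exist_ok=True)  # create the output directory if it is missing
plt.savefig(fname)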