Nature recently published a commentary from IBM on new research into an "all-optical" deep neural network. Built from optical devices, such a network can be more energy-efficient than traditional computing methods and offers scalability, high bandwidth and no need for photoelectric conversion. The work may lay a foundation for future optical neural network accelerators.
Optical fiber carries data around the world in the form of light and has become a pillar of modern telecommunications. If the transmitted data need to be analyzed, however, they must first be converted from optical to electronic signals and then processed by electronic equipment. Optics was for some time regarded as the basis of the most promising future computing technology, but optical computing has struggled to compete with the rapid progress of electronic computers.
In the past few years, however, the industry has paid growing attention to the energy cost of computing, and optical computing systems have attracted renewed interest. Optical computing consumes little energy and can serve as special-purpose acceleration hardware for AI algorithms such as deep neural networks (DNNs). Feldmann and colleagues recently reported the latest progress toward such an all-optical network implementation in the journal Nature.
A deep neural network consists of multiple layers of artificial neurons connected by artificial synapses. The strengths of these connections, called network weights, can be positive, indicating neuronal excitation, or negative, indicating neuronal inhibition. To perform tasks such as image recognition, the network adjusts its synaptic weights to minimize the difference between its actual output and the expected output.
CPUs and hardware accelerators are usually used for DNN computation. A DNN is trained on a known data set, and the trained DNN is then used for inference on unknown data. Although the amount of computation is large, the variety of operations is low, because multiply-accumulate operations over the many synaptic weights and neuronal excitations dominate.
DNNs also continue to work well when computational precision is low, so they represent a potential opportunity for non-traditional computing technologies. Researchers are trying to build DNN accelerators based on new nonvolatile memory devices. Such devices retain information when the power supply is cut off and can improve the speed and energy efficiency of DNNs through analog electronic computing.
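The multiply-accumulate operation and the tolerance to low precision described above can be illustrated with a small sketch. It is not taken from the research; the input values, the weights, the 16-level quantization and the function names `mac` and `quantize` are all assumptions chosen for illustration.

```python
# Hypothetical illustration: one artificial neuron computed as a
# multiply-accumulate (MAC) loop, first with full-precision weights and
# then with weights rounded to a small number of levels.

def mac(inputs, weights):
    """Sum of input * weight over all synapses feeding one neuron."""
    total = 0.0
    for x, w in zip(inputs, weights):
        total += x * w  # one multiply-accumulate per synapse
    return total

def quantize(values, levels=16, lo=-1.0, hi=1.0):
    """Round each value to the nearest of `levels` evenly spaced steps in [lo, hi]."""
    step = (hi - lo) / (levels - 1)
    return [lo + round((min(max(v, lo), hi) - lo) / step) * step for v in values]

inputs = [0.9, 0.1, 0.4, 0.7]          # made-up neuron excitations
weights = [0.52, -0.31, 0.08, -0.77]   # positive = excitatory, negative = inhibitory

exact = mac(inputs, weights)
coarse = mac(inputs, quantize(weights))  # weights stored with only 16 levels

print("full precision:", exact)
print("16-level weights:", coarse)
```

With these made-up numbers the two results differ only slightly, which is the sense in which a coarse analog representation of the weights can still be good enough.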
So why not consider optics? Light-guiding components - whether optical fibers for telecommunications or waveguides on photonic chips - can carry large amounts of data. In such a waveguide, wavelength division multiplexing allows many different wavelengths of light to propagate together. Each wavelength can then be modulated (changed in a way that carries information) at a rate limited by the bandwidth available for electro-optic modulation and optoelectronic detection.
Fig. 1: All-optical spiking neuron circuit
Resonators make it possible to add or remove a single wavelength, much like loading and unloading a truck. An array of synaptic weights for a DNN can be built from micrometer-scale ring resonators. The resonators can be modulated thermally, electro-optically or by phase change materials. These materials can switch between an amorphous phase and a crystalline phase, and the two phases differ greatly in how strongly they absorb light. Under ideal conditions, the power consumed by multiply-accumulate operations is very low.
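As a rough sketch of the idea above, a wavelength-multiplexed weighted sum can be modeled by giving each input its own wavelength, attenuating it by a transmission factor that stands in for the synaptic weight, and adding up the remaining optical power. This is a simplified toy model under stated assumptions, not the chip's actual design: the wavelengths, powers, transmission values and the function name `weighted_sum` are all hypothetical.

```python
# Hypothetical toy model of a wavelength-multiplexed weighted sum.
# All wavelengths, powers and transmission values are made-up numbers.

def weighted_sum(input_power_mw, transmission):
    """Total power after each wavelength passes through its own weighting element."""
    total = 0.0
    for wavelength_nm, power in input_power_mw.items():
        t = transmission[wavelength_nm]
        if not 0.0 <= t <= 1.0:
            raise ValueError("a passive absorber can only transmit between 0 and 1")
        total += power * t  # multiply (attenuate), then accumulate (sum of powers)
    return total

# One input per wavelength (nm -> optical power in mW).
input_power_mw = {1550.0: 1.0, 1552.0: 0.5, 1554.0: 2.0}
# One weight per wavelength, expressed as the fraction of light transmitted.
transmission = {1550.0: 0.8, 1552.0: 0.2, 1554.0: 0.5}

total_mw = weighted_sum(input_power_mw, transmission)
print("summed output power (mW):", total_mw)
```

In this simplified picture every weight lies between 0 and 1, because an absorbing element can only remove light.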
Feldmann's research team has implemented an all-optical neural network on a millimeter-scale photonic chip, with no photoelectric conversion inside the network. The input data are electronically modulated onto different wavelengths and injected into the network, but from then on all the data remain on the chip. Integrated phase change materials are used both to set the synaptic weights and to implement the neurons.
Fig. 2: Spike generation and operation of an artificial neuron
The authors demonstrate supervised and unsupervised learning on a small scale - that is, training with labeled data (the way DNNs learn) and training with unlabeled data (similar to the way humans learn).
Fig. 3: Supervised and unsupervised learning in an all-optical neuron system based on phase change materials
Because the weights are implemented by light absorption, negative weights require a large bias signal, and that signal must not activate the phase change material. An alternative is to use a Mach-Zehnder interferometer, a device that splits a single waveguide into two arms and then recombines them; the amount of transmitted light then depends on the difference in optical phase between the two propagation paths. This approach may be difficult to combine with wavelength division multiplexing, however, because the arms of each interferometer would need to introduce the appropriate phase difference for every wavelength.
All-optical implementations of DNNs still face major challenges. Although their total power consumption may be low in the ideal case, thermo-optic power is often needed to adjust and maintain the optical phase difference between the arms of each Mach-Zehnder interferometer.
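The Mach-Zehnder behavior mentioned above can be sketched with the textbook formula for an ideal, lossless, balanced interferometer, where the transmitted fraction is cos²(Δφ/2). The sketch also shows why one fixed path difference gives a different phase difference, and hence a different weight, at each wavelength. The effective index, path difference and wavelengths are assumed values for illustration, and the function names are hypothetical.

```python
import math

# Hypothetical illustration of an ideal Mach-Zehnder interferometer (MZI).
# n_eff, the path difference and the wavelengths are made-up values.

def mzi_transmission(delta_phi):
    """Fraction of input power transmitted by an ideal, lossless, balanced MZI."""
    return math.cos(delta_phi / 2.0) ** 2

def phase_difference(path_diff_um, wavelength_um, n_eff=2.0):
    """Phase difference between the arms for a given optical path difference."""
    return 2.0 * math.pi * n_eff * path_diff_um / wavelength_um

# Equal phases give full transmission; a half-wave difference gives none.
print(mzi_transmission(0.0), mzi_transmission(math.pi))

# The same physical path difference acts differently on two nearby wavelengths.
t_a = mzi_transmission(phase_difference(10.0, 1.55))
t_b = mzi_transmission(phase_difference(10.0, 1.56))
print("transmission at 1.55 um:", t_a)
print("transmission at 1.56 um:", t_b)
```

Because the two wavelengths see different transmissions for the same arm setting, a single interferometer cannot simply be shared by many multiplexed channels without further per-wavelength adjustment.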
Fig. 4: Scalable architecture of the all-optical neural network
In addition, the total optical power injected into a system containing phase change materials must be carefully calibrated so that the materials respond to the input signal as expected. Phase change materials can also be used to adjust the Mach-Zehnder phase, but there is unavoidable cross-coupling between how strongly the material absorbs light and how much it slows the light down, which adds to the complexity of the system.
Traditional DNNs have grown large and may contain thousands of neurons and millions of synapses. The waveguides of a photonic network, however, must be kept far apart to prevent coupling and must avoid sharp bends so that light does not leave the waveguide. Crossing two waveguides may also inject unwanted power into the wrong path, so the two-dimensional nature of a photonic chip substantially limits the design.
Fig. 5: Experimental implementation of a single-layer spiking neural network
Building a neural network from optical devices takes long distances and a large area, yet the key parts of each optical structure must be fabricated with high precision. This is because the waveguides and coupling regions, such as those at the input and output of each microring resonator, must have exactly the dimensions required for the intended network performance. There are also limits on how small microring resonators can be made.
Finally, the modulation techniques produce only weak optical effects, so a long interaction region is needed for their limited influence on the passing light to reach a significant level.
The progress reported by Feldmann's team is expected to promote the future development of this field. The research may lay a foundation for energy-efficient and scalable optical neural network accelerators in the future.