CYSMICS

With Global Threat
Comes Global Responsibility

Our mission

CYSMICS is a joint collaborative center between the Cybersecurity Research Center (CYSEC) at TU Darmstadt, Germany, and the Center for Machine-Integrated Computing & Security (MICS) at UC San Diego, US. Our mission is to tackle current and upcoming challenges in machine-learning-based automated systems, privacy, and security, including scalability issues, IP protection and adversarial machine learning.

As malicious users have increasing incentives to trick machine learning algorithms, we develop new defenses across all layers: algorithm design, software, and the underlying hardware. Moreover, as the popularity of AI surges and competition grows rapidly, IP protection for pre-trained machine learning models is of unprecedented importance. The rise of embedded and IoT (Internet of Things) devices poses an additional challenge: developing lightweight, secure systems powered by machine learning.

Selected projects

Challenges in Trusted Execution Environments

The recent increase in the number and sophistication of cyber attacks underlines the need for trusted components in computation platforms. To address this need, Trusted Execution Environments (TEEs) have recently gained popularity. TEEs make it possible to protect security-critical components of an application from other, potentially compromised components.

Our TEE research focuses both on prominent industry solutions, such as Intel Software Guard Extensions (SGX) and ARM TrustZone, and on next-generation trusted execution platforms. We started by securing real-world applications, including web browsers and speech recognition engines, against real-world attacks using Intel SGX. We also developed defenses against common side-channel attacks on SGX enclaves.

In contrast, access to ARM TrustZone is heavily restricted, and companies like Google are further locking down access to reduce the possibility of a TrustZone breach. Consequently, we investigated ways to bring openly accessible enclaves to ARM platforms in our Sanctuary project, which allows security-critical code to be isolated in an unprivileged environment. However, designing a TEE on a closed platform is challenging and requires compromises in functionality and security. Hence, we are also working on a TEE for RISC-V, where we can start from scratch: by leveraging open-source SoC implementations, we can design a TEE without these drawbacks for tomorrow's devices.

Scalable Secure Function Evaluation

Secure function evaluation (SFE) refers to the grand challenge of how two or more parties can correctly compute a joint function of their respective private inputs without exposing those inputs. The collaborative work of MICS and TU Darmstadt researchers has focused on a number of technologies to achieve such privacy-preserving SFE, with emphasis on provable protection, scalability, and practical implementation. A particular focus area is privacy-preserving computing using the Garbled Circuit (GC) technique, a generic approach to secure two-party computation for semi-honest participants. The GC protocol was developed by Yao in the mid-1980s, but was long viewed as being of limited practical significance due to its inefficiency. Despite several important advances over the past three decades, the methods available prior to the joint MICS-Darmstadt work suffered from poor scalability and a lack of global logic optimization.
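
To make the core idea concrete, the following minimal Python sketch garbles and evaluates a single AND gate in the spirit of Yao's protocol. It is a didactic toy with hypothetical helper functions, not the TinyGarble implementation, and it omits oblivious transfer and the standard optimizations (point-and-permute, free-XOR, and so on).

    # Toy garbled AND gate: the garbler hides the truth table behind random
    # wire labels, and the evaluator learns only the output label.
    import hashlib
    import secrets

    LABEL_BYTES = 16

    def new_label() -> bytes:
        # Random wire label standing in for a hidden 0 or 1 value.
        return secrets.token_bytes(LABEL_BYTES)

    def xor_encrypt(key_a: bytes, key_b: bytes, message: bytes) -> bytes:
        # One-time pad derived from hashing the two input labels; XOR is its own inverse.
        pad = hashlib.sha256(key_a + key_b).digest()[:LABEL_BYTES]
        return bytes(m ^ p for m, p in zip(message, pad))

    def garble_and_gate():
        # Garbler: pick two labels per wire and encrypt the AND truth table row by row.
        labels = {wire: (new_label(), new_label()) for wire in ("a", "b", "out")}
        table = [
            xor_encrypt(labels["a"][bit_a], labels["b"][bit_b], labels["out"][bit_a & bit_b])
            for bit_a in (0, 1)
            for bit_b in (0, 1)
        ]
        secrets.SystemRandom().shuffle(table)  # hide which row corresponds to which inputs
        return labels, table

    def evaluate(table, label_a, label_b, output_decoding):
        # Evaluator: try every row; only the matching row decrypts to a known output label.
        for row in table:
            candidate = xor_encrypt(label_a, label_b, row)
            if candidate in output_decoding:
                return output_decoding.index(candidate)
        raise ValueError("no row decrypted to a valid output label")

    labels, table = garble_and_gate()
    # In a full protocol each party obtains the label for its private bit via oblivious
    # transfer; here we simply pick the labels for a = 1 and b = 0 to demonstrate.
    result = evaluate(table, labels["a"][1], labels["b"][0], list(labels["out"]))
    print("AND(1, 0) evaluated obliviously:", result)  # prints 0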

Our joint work on this topic, called TinyGarble, has introduced a paradigm shift in GC protocol implementation by creating a compact and scalable garbled Boolean circuit description format. A unique aspect of our solution is that it views circuit generation for GC as an atypical logic synthesis task. By properly defining the synthesis problem and using our newly created custom libraries and transformations, we show how logic synthesis can be adapted to elegantly address this problem. The proposed methods improve the results of the best known automatic tool for GC generation by several orders of magnitude, enabling embedded implementations. The findings also enable a spectrum of novel concepts and applications, including the first scalable implementation of a garbled CPU (prototyped on FPGA) that is provably leakage-resilient in non-interactive settings, the first efficient and scalable secure stable matching, as well as privacy-preserving nearest-neighbor and location searches.

Our most recent collaborative work on this topic introduces game-changing benefits by providing a higher level of abstraction for the input functions. Wide-scale adoption of TinyGarble required writing the functions in Verilog or VHDL, standard hardware description languages. While these languages are well known to the hardware design community, the input format kept the wider software community from adopting the benefits offered by TinyGarble. More specifically, our recent work, called ARM2GC, enables security and software researchers to express their functions in any language that can be compiled to ARM, while achieving almost the same efficiency as hardware-description-language inputs.

Privacy-preserving machine learning

Our joint center has the unique capability to execute state-of-the-art deep learning models on encrypted data in a provably secure, scalable, and rapid way, without losing accuracy. In particular, we have made significant strides for the popular class of deep learning (DL) models. The applicability of DL models, however, is hindered in settings where the risk of data leakage raises serious privacy concerns. Examples include scenarios where clients hold sensitive private information, e.g., medical records, financial data, or location. The solutions developed by our joint center cover a spectrum of techniques, including single protocols such as Garbled Circuits (GC) as well as mixed protocols that combine secret sharing, GC, and homomorphic encryption.
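
As a toy illustration of one of these building blocks, the sketch below performs a neural-network linear layer on two-party additive secret shares over a small prime field. The setup is an illustrative assumption, not our deployed framework; in particular, the cross terms that a real protocol computes interactively (e.g., with Beaver triples) are computed in the clear here for brevity.

    # Two-party additive secret sharing of a linear layer y = W @ x over a prime field.
    import numpy as np

    P = 65_537  # small prime modulus defining the arithmetic field

    def share(value):
        # Split an integer array into two random additive shares modulo P.
        share0 = np.random.randint(0, P, size=value.shape, dtype=np.int64)
        share1 = (value - share0) % P
        return share0, share1

    def reconstruct(share0, share1):
        return (share0 + share1) % P

    # The client holds a private input vector x; the server holds private weights W.
    x = np.random.randint(0, 100, size=4, dtype=np.int64)
    W = np.random.randint(0, 100, size=(3, 4), dtype=np.int64)

    # Each party secret-shares its data, so the other side only ever sees random-looking values.
    x0, x1 = share(x)
    W0, W1 = share(W)

    # Terms each party can compute locally on the shares it holds.
    y0_local = (W0 @ x0) % P
    y1_local = (W1 @ x1) % P

    # The cross terms W0*x1 and W1*x0 need an interactive multiplication protocol
    # (e.g. Beaver triples); this sketch computes and re-shares them in the clear,
    # which is exactly the step a real mixed protocol performs securely.
    cross0, cross1 = share((W0 @ x1 + W1 @ x0) % P)

    y0 = (y0_local + cross0) % P
    y1 = (y1_local + cross1) % P

    # Reconstructing the two output shares yields the plaintext linear layer W @ x.
    assert np.array_equal(reconstruct(y0, y1), (W @ x) % P)
    print("shared computation matches plaintext W @ x:", reconstruct(y0, y1))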

As an example, our recent work in this area, called DeepSecure, provides the first provably secure framework for scalable DL-based analysis of data collected by distributed clients. DeepSecure enables applying state-of-the-art DL models to sensitive data without sacrificing accuracy to obtain security. Consistent with the literature, we assume an honest-but-curious adversary model for a generic setting in which both the DL parameters and the input data must be kept private. DeepSecure uses Yao's Garbled Circuit (GC) protocol to perform DL execution securely. In contrast with prior work based on homomorphic encryption, such as Microsoft's CryptoNets, our methodology does not involve a trade-off between utility and privacy. We show that our framework is the best choice in scenarios where each distributed client collects fewer than 2600 samples and sends the data to the server for processing with the least possible delay. Our approach is well suited for streaming settings where clients need to analyze their data dynamically as it is collected over time, without having to queue samples to meet a certain batch size. Our optimized implementation achieves more than 58-fold higher throughput per sample compared with the best prior solution. In addition to our optimized GC realization, we introduce a set of novel low-overhead pre-processing techniques that further reduce the overall GC runtime in the context of deep learning. Extensive evaluations of various DL applications demonstrate up to two orders of magnitude additional runtime improvement as a result of our pre-processing methodology. We also provide mechanisms to securely delegate GC computations to a third party in constrained embedded settings.

Safe Machine Learning by Characterizing and Defending the Space of Adversarial Attacks

The increasing use of machine learning, and in particular the popular class of deep learning (DL) models, is creating incentives for malicious users to attack DL models by crafting adversarial samples. Adversarial samples are inputs created by malicious users with the intent of misleading DL networks. Studying adversarial samples is important from two perspectives. On the one hand, it reveals the weaknesses of machine learning models and classifiers commonly used in autonomous systems. On the other hand, it provides an opportunity to develop DL algorithms with better generalization and robustness. Existing work in the literature has mainly focused on devising integrity attack models, with little or no attention to countermeasures that reduce DL susceptibility to adversarial samples.
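
For concreteness, the minimal sketch below crafts an adversarial sample with the well-known fast gradient sign method (FGSM) against a toy logistic-regression classifier. It illustrates the general notion of an integrity attack; the model and numbers are illustrative assumptions, not the context-aware attack model developed in our center.

    # FGSM on a toy logistic-regression classifier: a small, bounded perturbation
    # of the input sharply reduces the model's confidence in the true class.
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # A tiny pre-"trained" binary classifier: p(class 1 | x) = sigmoid(w . x + b).
    w = rng.normal(size=20)
    b = 0.1

    # A benign input that the classifier assigns to class 1 with high confidence.
    x = 0.2 * w
    y_true = 1
    print("clean confidence for class 1:      ", sigmoid(w @ x + b))

    # FGSM: perturb each feature by epsilon in the direction that increases the
    # loss; for the logistic loss, the gradient w.r.t. the input is (p - y) * w.
    epsilon = 0.5
    p = sigmoid(w @ x + b)
    grad_x = (p - y_true) * w
    x_adv = x + epsilon * np.sign(grad_x)

    print("adversarial confidence for class 1:", sigmoid(w @ x_adv + b))
    # The per-feature perturbation is bounded by epsilon, yet the confidence in
    # the true class drops sharply: the crafted input misleads the classifier.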

The focus of our center is on three separate but inter-linked thrusts related to adversarial samples:

  • Designing a context-aware integrity attack model to craft adversarial samples based on a precise understanding of both DL model topology and input/output data structure. We leverage the composition of data as an ensemble of lower-dimensional subspaces to identify/restrain the effective adversarial space with respect to the data context and learning task. We evaluate and compare the success of the proposed attack model against existing methodologies in terms of the detectability of crafted samples by the human cognitive system.
  • Characterizing the adversarial space of DL networks. We introduce a set of novel metrics to quantify the vulnerability of various DL models to the adversarial samples. Our metrics provide a systematic approach to explain the adversarial input space.
  • Providing a set of robust countermeasures that minimize DL vulnerability to adversarial samples, customized to the limits of the pertinent model topology. The proposed countermeasure explicitly targets the subspace embeddings of the DL model and maximizes the distances between the corresponding manifolds in the latent feature space. Our methodology identifies outliers in the input/output space by learning the corresponding Probability Density Function (PDF) for each category; a simplified sketch of this density-based detection idea follows this list. Our results show detection superior to earlier works, defeating the attacks reported in the literature.
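
The following minimal sketch illustrates the density-based detection idea on synthetic two-dimensional features: a Gaussian PDF is learned per class, and inputs whose likelihood under their predicted class is unusually low are flagged. The data, model, and thresholds are illustrative assumptions, not our actual defense.

    # Per-class Gaussian density modeling of "latent" features for outlier detection.
    import numpy as np

    rng = np.random.default_rng(1)

    # Clean latent features for two classes; placeholders for embeddings that a
    # trained network would produce on benign data.
    class_means = {0: np.array([0.0, 0.0]), 1: np.array([4.0, 4.0])}
    clean = {c: rng.normal(loc=m, scale=1.0, size=(500, 2)) for c, m in class_means.items()}

    # Learn a diagonal Gaussian PDF per class from the clean features.
    params = {c: (feats.mean(axis=0), feats.var(axis=0) + 1e-6) for c, feats in clean.items()}

    def log_likelihood(x, c):
        mean, var = params[c]
        return float(-0.5 * np.sum((x - mean) ** 2 / var + np.log(2 * np.pi * var)))

    # Flag anything whose likelihood falls below the 1st percentile of clean samples.
    thresholds = {
        c: np.percentile([log_likelihood(x, c) for x in feats], 1)
        for c, feats in clean.items()
    }

    def is_suspicious(x, predicted_class):
        return log_likelihood(x, predicted_class) < thresholds[predicted_class]

    typical_input = rng.normal(loc=class_means[1], scale=1.0, size=2)
    off_manifold_input = np.array([2.0, -3.0])  # far from both class manifolds

    print("typical input flagged:     ", is_suspicious(typical_input, 1))       # expected False
    print("off-manifold input flagged:", is_suspicious(off_manifold_input, 1))  # expected True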

Protection of Machine Learning IP by Watermarking and Fingerprinting

The popularity of machine learning, and in particular deep learning (DL) with neural networks, has raised practical concerns about the ownership and unintended redistribution of pre-trained models. We believe that embedding digital watermarks or fingerprints into DL models is critically important for reliable technology transfer. A digital watermark or fingerprint is a marker covertly embedded in a signal or IP, such as audio, video, images, or a functional design. Watermarking has been leveraged extensively over the past decade to protect the ownership of multimedia and video content, as well as of digital circuit functionality. Extending watermarking techniques to DL models, and particularly to deep neural networks, is still in its infancy. Our novel technology enables coherent integration of robust digital watermarks and fingerprints into contemporary deep learning models. Our solution, for the first time, introduces a generic functional watermarking and fingerprinting methodology that is applicable to both black-box and white-box settings. We emphasize that our approach is compatible with existing solutions such as Generative Adversarial Networks (GAN) and Reinforcement Learning (RL), providing IP protection across a wide range of applications.
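
As a simplified illustration of black-box watermarking, the sketch below embeds a trigger set (a small collection of secret key inputs with owner-chosen labels) during training and later verifies ownership by querying a suspect model on those keys. This is a generic approach shown only for intuition; the model, synthetic data, and threshold are assumptions rather than our exact embedding scheme.

    # Trigger-set watermarking of a small classifier: memorize secret keys, then
    # prove ownership by checking accuracy on those keys in a black-box query.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(42)

    # Ordinary training data for the primary task: two Gaussian blobs.
    X_task = np.vstack([rng.normal(-2, 1, (200, 10)), rng.normal(2, 1, (200, 10))])
    y_task = np.array([0] * 200 + [1] * 200)

    # Watermark trigger set: random key inputs, far from the task data, with
    # owner-chosen labels that only the legitimate owner knows.
    X_trigger = rng.uniform(8, 12, (20, 10))
    y_trigger = rng.integers(0, 2, 20)

    # Train on the task data plus the trigger set so the model memorizes the watermark.
    model = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
    model.fit(np.vstack([X_task, X_trigger]), np.concatenate([y_task, y_trigger]))

    def verify_watermark(suspect_model, X_key, y_key, threshold=0.9):
        # Black-box ownership check: a stolen copy keeps high accuracy on the secret keys.
        return suspect_model.score(X_key, y_key) >= threshold

    print("task accuracy:            ", model.score(X_task, y_task))
    print("watermarked model passes: ", verify_watermark(model, X_trigger, y_trigger))

    # An independently trained model that never saw the trigger set should fail the check.
    clean_model = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=1)
    clean_model.fit(X_task, y_task)
    print("independent model passes: ", verify_watermark(clean_model, X_trigger, y_trigger))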

Our joint center's recent research on this topic is focused on two separate but inter-linked thrusts as detailed below:

  • Designing a generic watermarking methodology that is applicable in both white-box and black-box settings, where the adversary may or may not know the internal details of the model. We provide a new and comprehensive set of metrics to assess the performance of a watermark embedding approach for DL models.
  • Developing a collusion-resilient fingerprinting technique in the context of neural networks for tracing the usage of distributed models. We leverage the theory of anti-collusion codes to design a unique fingerprint for each user and adapt our watermarking methodology for fingerprinting. We introduce a set of requirements for the fingerprints in the DL domain and demonstrate that our fingerprinting technique satisfies all the criteria.

The Game of Drones

We are increasingly surrounded by interconnected embedded systems that collect sensitive information and perform safety-critical operations.

Most embedded systems perform simple tasks upon reception of a command, in a predefined manner. However, in recent years, embedded systems have been increasingly designed to carry out autonomous collaborative tasks.

Networks of autonomous embedded systems, such as vehicular ad-hoc networks, robotic factory workers, search-and-rescue robots, and drones, are already being used to perform urgent, tiresome, and critical tasks with minimal human intervention.

For example, drones are used (or envisioned to be used) for various tasks, such as search and rescue, construction site management, security and surveillance, cargo delivery, and natural-disaster prediction and warning.

IoT Security Lab

IoT devices are being widely deployed in many areas of life. For example, they can help users control and automate features of their smart homes, optimize the use of energy and resources, and control traffic in smart cities and buildings. However, as IoT is a rapidly emerging area, many device manufacturers emphasize speed to market and release IoT products that have not been properly designed or tested with security in mind. As a result, many devices have vulnerabilities that make them easy for malicious attackers to compromise. Partly due to this, a new category of malware specifically targeting IoT devices has emerged and has been responsible for high-impact security attacks against prominent Internet services.

In our institute, we focus on developing security solutions for IoT. To enable effective security management of the very heterogeneous IoT device base in smart home and small office environments, we have developed machine-learning-based device identification methods that allow proactive security measures to be deployed for defending IoT devices against potential attacks and for containing the risks emerging from vulnerable devices.
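
A minimal sketch of this idea follows, using a random forest over a handful of assumed per-device traffic features; the features, device types, and synthetic data are illustrative placeholders, not our deployed identification system.

    # Toy IoT device-type identification from coarse traffic statistics.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(7)

    def synth_flows(n, mean_pkt, pkt_rate, dns_frac, label):
        # Generate n synthetic per-device traffic summaries for one device type.
        feats = np.column_stack([
            rng.normal(mean_pkt, 30, n),                    # average packet size (bytes)
            rng.normal(pkt_rate, 2, n),                     # packets per second
            np.clip(rng.normal(dns_frac, 0.05, n), 0, 1),   # share of DNS packets
        ])
        return feats, np.full(n, label)

    # Three hypothetical device types with different traffic fingerprints.
    X_cam, y_cam = synth_flows(300, mean_pkt=900, pkt_rate=40, dns_frac=0.01, label=0)   # IP camera
    X_plug, y_plug = synth_flows(300, mean_pkt=120, pkt_rate=2, dns_frac=0.10, label=1)  # smart plug
    X_spk, y_spk = synth_flows(300, mean_pkt=500, pkt_rate=15, dns_frac=0.05, label=2)   # smart speaker

    X = np.vstack([X_cam, X_plug, X_spk])
    y = np.concatenate([y_cam, y_plug, y_spk])
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
    print("device-type identification accuracy:", clf.score(X_test, y_test))

    # A newly observed device can then be mapped to a type-specific security policy,
    # e.g. restricting a camera to its vendor cloud while blocking lateral traffic.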

Another challenge we are investigating relates to the fact that existing intrusion detection techniques are often not effective at detecting compromised IoT devices in a timely manner, given the enormous number of device types and manufacturers involved. To resolve this challenge, we are developing methods for automated and autonomous device-type-specific detection of deviant behaviour that can be attributed to malicious compromise, e.g., infection by IoT malware. In addition, we are investigating the use of software-based methods for providing fine-grained control over the connectivity of devices in local IoT networks, and we are applying a cyber-deception-based architecture for protecting networks against external and internal attackers.
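
As a simplified illustration of device-type-specific anomaly detection, the sketch below trains an isolation forest on assumed benign traffic statistics of one device type and flags a scanning-like burst; the features and data are synthetic placeholders rather than our production detection pipeline.

    # Toy detection of deviant IoT device behaviour with an isolation forest.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(3)

    # Benign behaviour of one device type: low packet rate, few distinct peers.
    benign = np.column_stack([
        rng.normal(2.0, 0.5, 1000),    # packets per second
        rng.normal(3.0, 1.0, 1000),    # distinct destination hosts per minute
    ])

    detector = IsolationForest(contamination=0.01, random_state=0).fit(benign)

    # New observations: a normal traffic window vs. a malware-like scanning burst.
    normal_window = np.array([[2.1, 3.2]])
    scanning_window = np.array([[80.0, 250.0]])   # sudden flood of new destinations

    print("normal window:  ", detector.predict(normal_window))    # +1 -> benign
    print("scanning window:", detector.predict(scanning_window))  # -1 -> flagged as anomalous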

Testing and evaluating the security solutions employed in IoT devices and architectures is an important basis for assessing the big picture of IoT security and for developing a roadmap toward secure IoT solutions. We therefore actively perform security analyses of prominent IoT devices and services, such as voice-based virtual assistants, in our IoT hacking lab. This gives us a thorough understanding of the designs and solutions employed in the IoT space, which is required for systematically assessing their security and identifying where improvement is needed.

Inaugural Workshop

Thursday, February 28, 2019

8:30 – 9:00 AM Registration and Networking
9:00 – 9:15 AM Directors' Welcome – Farinaz Koushanfar, Co-Director, Center for Machine-Integrated Computing & Security (MICS) & Professor, Electrical and Computer Engineering
9:15 – 9:40 AM Trends in AI with Privacy and Security – Fabian Boemer and Ro Cammarota, Intel AI Research
9:40 – 10:05 AM CYSEC in CYSMICS – Ahmad-Reza Sadeghi, TU Darmstadt
10:05 – 10:30 AM Machine Learning for Systems – Azalia Mirhoseini, Google Brain
10:30 – 10:45 AM Break
10:45 – 11:10 AM MICS in CYSMICS – Farinaz Koushanfar, UC San Diego
11:10 – 11:35 AM Protecting Existing Smart Contracts Against Attacks – Lucas Davi, University of Duisburg-Essen
11:35 AM – 12:00 PM Challenges in Privacy-Preserving Data Analysis – Kamalika Chaudhuri, UC San Diego
12:00 – 12:35 PM Lunch and Networking
12:35 – 1:00 PM Secure Execution and Its Applications – Srdjan Capkun, ETH Zurich
1:00 – 1:25 PM Accelerating Intelligence: An Edge to Cloud Continuum – Hadi Esmaeilzadeh, UC San Diego
1:25 – 1:50 PM IoT Security – Gene Tsudik, UC Irvine
1:50 – 2:15 PM Top Picks in Real-World AI Security and Privacy. Panelists:
  • Srdjan Capkun, ETH Zurich
  • Tara Javidi, UC San Diego
  • Casimir Wierzynski, Intel AI Research
  • Azalia Mirhoseini, Google Brain
Panel Moderator: Ahmad-Reza Sadeghi, TU Darmstadt
2:15 – 3:00 PM Open Discussions and Networking

Principal Investigators

Ahmad-Reza Sadeghi, Technische Universität Darmstadt.
Ahmad's expertise includes trusted and secure computing as well as security and privacy for the Internet of Things.

Farinaz Koushanfar, University of California San Diego (UCSD).
Farinaz's expertise includes data analytics in constrained settings, embedded systems security, and robust machine learning.

Tara Javidi, University of California San Diego (UCSD).
Tara's expertise includes statistical analysis and robust machine learning.

Publications

  • M. Sadegh Riazi, Mohammad Samragh, Hao Chen, Kim Laine, Kristin Lauter, Farinaz Koushanfar. XONN: XNOR-based Oblivious Deep Neural Network Inference. 28th USENIX Security Symposium. August 2019.