Rapid growth in the use of engineering simulation tools – and in the demand for high performance computing (HPC) – is driving interest in cloud computing. Using the cloud for simulation presents unique challenges, with different solution types required for specific use cases. For many years, I have been on this journey with customers adopting cloud computing; quite a few of them have been enabled through the UberCloud project. Let me share some lessons learned and key takeaways, in the form of eight “best practices”:
- Don’t Move the Data (More Than You Have to)
- Remote Graphics & Graphical User Interface for End-to-End Simulation
- Secure Network Communications & Data Storage
- Effective End-User Access for Job & Data Management
- Re-Use On-Premise Licenses (or Not)
- Consider a Mix of Business Models
- Match Your Cloud to Your HPC Workload
- Start Small, Grow Organically… but Think Big
This post will cover the first four cloud computing best practices for engineering simulation.
1 – Don’t Move the Data (More Than You Have To)
The first best practice relates to data storage – and the idea of minimizing the transfer of data back and forth between the cloud backend and the end-user. Clearly, some data motion will be needed: the end-user may be doing CAD on the desktop and will need to move that CAD file up to the simulation center on the cloud. Luckily, these input files are relatively small – measured in megabytes – and typically take less than a minute to transfer. Simulation result files, on the other hand, are typically huge – gigabytes or even terabytes of data – and can take hours or even days to download.
So the best practice is not to download the data – make sure that the cloud is both a compute and a storage solution, at least for work-in-progress data. That means you need data security on the cloud (I’ll come back to that) and you need backup/disaster recovery for that data.
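A quick back-of-the-envelope calculation shows why this asymmetry matters. The link speed and file sizes below are illustrative assumptions, not measurements:

```python
# Rough transfer-time estimate for simulation data over a WAN link.
# LINK_MBPS and the file sizes are illustrative assumptions.
LINK_MBPS = 100  # assumed 100 Mbit/s connection to the cloud


def transfer_minutes(size_gb: float, mbps: float = LINK_MBPS) -> float:
    """Time to move size_gb gigabytes over an mbps link, in minutes."""
    bits = size_gb * 8e9          # gigabytes -> bits
    return bits / (mbps * 1e6) / 60


cad_input = transfer_minutes(0.05)   # a 50 MB CAD input file
results = transfer_minutes(500)      # a 500 GB results set

print(f"input:   {cad_input:.1f} min")      # input:   0.1 min
print(f"results: {results / 60:.1f} h")     # results: 11.1 h
```

Uploading the input is effectively free; pulling the results back down is a working day. Leaving results on the cloud side of the link is what makes the workflow practical.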
2 – Remote Graphics & Graphical User Interface for End-to-End Simulation
The second best practice follows from the idea of leaving your data on the cloud. End-users will need to perform full end-to-end simulation on the cloud, meaning not just batch solves but also interactive GUI processes and graphical post-processing. Most simulation workloads involve 3D graphics, so you will need a remote graphics software tool with server-side acceleration and good performance over the network – and reasonable network latency. And you’ll want full remote desktop access to that server (not just the application in a window) so that you can edit and manage files, maybe compile add-in routines, etc.
All this implies a graphics server on the cloud, with sufficient memory to load and display large simulation models. One issue I have seen is that not all cloud back-ends support this. The figure shows a solution architecture that ANSYS has been demonstrating in conjunction with partners NICE and AWS – using a high-memory server instance to run the application, with NICE DCV handling graphics via external rendering on a lower-memory graphics server. Through the Enterprise Cloud solution, customers have found that this can be a robust and economical way to enable large-model graphics on the cloud, without a large investment in multiple high-memory graphics servers.
3 – Secure Network Communications & Data Storage
This best practice probably relates to the biggest concern that most companies have when thinking about an external cloud solution for simulation. Simulation models contain the ‘crown jewels’ – product data – and the customer needs to be convinced that this data is secure and IP is protected.
For the sake of simplicity, data security really comes down to two ideas:
- Encryption of data in motion depends on a secure network protocol – and I see two gold standards here:
- The first is to establish a site-to-site VPN. This is a significant effort to put in place (in terms of IT resource) but provides scale to support many users and lots of data.
- The second approach is to perform all data transactions within a web user interface over HTTPS, which is more pragmatic if you are enabling just a small group of end-users or using the cloud intermittently.
- Securing data at rest should similarly be accomplished via encryption – ensuring that nobody can read the data if it is compromised. This can be done at the file system level or the application level. The file system level is the easiest, but it depends on having the right file system tools available.
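For the HTTPS approach, the practical point on the client side is that certificate validation must actually be enforced. Python's standard library makes this the default, as this minimal sketch shows (the at-rest remarks in the comments summarize the bullet above rather than add new mechanism):

```python
import ssl

# Data in motion: a default TLS client context in Python already
# enforces the two properties that matter when talking to an HTTPS
# portal -- certificate validation and hostname checking. Do not
# relax either of them in production code.
ctx = ssl.create_default_context()

print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True: peer cert must validate
print(ctx.check_hostname)                    # True: cert must match the host

# Data at rest is a separate layer: prefer file-system or volume
# encryption (e.g., an encrypted cloud storage volume) over rolling
# your own cipher in the application.
```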
I see our cloud-hosting partners setting up dedicated storage accessible only by an individual customer account, so that the stored data is secured by physically isolating it. Some of them provide ITAR compliance, and AWS has GovCloud with controlled access. A key concept related to security – that I am personally quite keen on – is that of “divided responsibility” (AWS calls it “shared responsibility”):
- the cloud provider needs to provide physical (building) and internal network security,
- the customer needs to ensure the OS is patched for security. The customer needs to ensure that they are using applications that are secure, good access controls (who has access to what, license compliance, etc.), with secure network protocols.
- the Independent Software Vendor (ISV) needs to ensure that the software applications are secure – esp. cloud portal software needs to enable encrypted data transfer and data storage.
The ANSYS Enterprise Cloud is designed as a direct extension of a company’s enterprise IT infrastructure and resources while leveraging the public cloud platform of Amazon Web Services (AWS). Designed and implemented as a Single Tenant Cloud (STC), it is essentially a virtual data center on the cloud, addressing one of the major market concerns with cloud computing: data security and IP protection.
4 – Effective End-User Access for Job & Data Management
This best practice relates to a key concern I see when our customers are thinking about cloud: will end-users lose productivity? End-users will need simple, intuitive job submission procedures, tuned to the requirements of the specific applications they are using. They’ll need a way to monitor jobs in progress. They will need ways to move, find, and retrieve data in a centralized, secure environment.
I see our cloud-hosting partners creating web user interfaces that enable these capabilities. And ANSYS has built its own web-based interface, ANSYS Cloud Gateway™, that manages the end-to-end simulation process. It provides a secure environment for model and results visualization, data storage and management, HPC job orchestration, and remote session management. It can also be customized to incorporate non-ANSYS commercial software as well as in-house developed codes.
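To make the idea of simple, application-tuned job submission concrete, here is a sketch of the kind of request such a web interface might accept. The field names, solver name, and payload shape are hypothetical illustrations – this is not the actual ANSYS Cloud Gateway API:

```python
import json


def build_job_request(solver: str, input_file: str, cores: int) -> str:
    """Assemble a JSON job description of the kind a simulation web
    portal might accept: which solver to run, which uploaded input
    file to use, and how much HPC capacity to allocate.
    All field names below are hypothetical."""
    if cores < 1:
        raise ValueError("need at least one core")
    return json.dumps({
        "solver": solver,
        "input": input_file,
        "resources": {"cores": cores},
        # best practice #1: leave results on the cloud by default
        "keep_results_on_cloud": True,
    })


request_body = build_job_request("structural", "bracket_model.inp", cores=64)
```

The point of the sketch is the shape of the interaction: the end-user states *what* to run and *how big*, and the portal handles scheduling, data placement, and monitoring behind the scenes.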
Stay tuned for the other four best practices, which I will share with you next week!