R&D

Minerva - High Performance Computing


Resources

An HPC cluster is a computing system built from high-performance components (CPUs, memory, network connections, and storage) that allows users to request the set of resources needed to run computationally intensive jobs, and that runs those jobs when the requested resources become available. Jobs may consist of tasks executed sequentially (without any kind of parallelism), of tasks that run in parallel on a single computer (such as shared-memory programs that do not use the Message Passing Interface, MPI), or of tasks that run in parallel across several computers interconnected by a network (such as MPI programs). Regardless of the type of job submitted, users do not have to wait for one job to finish before submitting the next. Instead, they submit all the jobs they want to run to a central workload manager, which schedules them according to a set of established priority rules and dispatches them to compute nodes when the requested resources are available.
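As a minimal sketch of this workflow (job name, partition defaults, and program name are illustrative assumptions, not details taken from this page), a batch job for a SLURM-style workload manager is described in a small script:

```shell
#!/bin/bash
# Hypothetical SLURM batch script; the program "my_program" is an example.
#SBATCH --job-name=my_job        # name shown in the queue
#SBATCH --ntasks=4               # number of tasks (e.g. MPI processes)
#SBATCH --cpus-per-task=1        # cores per task
#SBATCH --mem-per-cpu=2G         # memory per core
#SBATCH --time=01:00:00          # wall-clock limit (hh:mm:ss)
#SBATCH --output=my_job_%j.out   # output file (%j expands to the job ID)

srun ./my_program                # run the program on the allocated resources
```

The script is submitted with `sbatch my_job.sh`; the workload manager queues it and starts it once the requested resources are free, so many jobs can be submitted at once without waiting for earlier ones to finish.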

Resources - Hardware

·      1× Head and Storage Node (Dell PowerEdge R720)

o  2 Intel Xeon E5-2680v2 CPUs (10 cores each) @ 2.80 GHz, 15 MB cache, 8.0 GT/s QPI

o  160 GB of registered and ECC DDR-3 memory (10×16 GB RDIMM, 1866 MT/s, Standard Voltage, Dual Rank, ×4 Data Width)

o  1 RAID Controller with 512 MB Non Volatile Cache

o  146 GB on a RAID1 array (2×146 GB Hot-Plug SAS HDDs, with 15,000 rpm and 6 Gbit/s)

o  1 PCIe SAS HBA with 2 external connectors with 4 ports each, at a transfer rate of 6 Gbit/s per port, connected to a storage unit (below)

o  1 Broadcom 5720 quad port 1 Gbit/s Ethernet network daughter card

o  1 Broadcom 57810 dual port 10 Gb DA/SFP+ Ethernet converged network adapter

o  1 single-port InfiniBand FDR 56 Gbit/s HCA (Mellanox ConnectX-3 VPI MCX353A-FCBT)

o  2 Hot-Plug power supplies in redundant mode

o  1 Dell iDRAC7 management board

·      1× Storage Unit (Dell PowerVault MD3220), attached to the PCIe SAS HBA on the Head and Storage Node

o  2 SAS connectors with 4 ports each, with a transfer rate of 6 Gbit/s per port

o  2 RAID controllers, with 4 GB of aggregate cache memory

o  28.8 TB of raw capacity (24×1.2 TB Hot-Plug SAS HDDs, with 10,000 rpm and 6 Gbit/s)

o  2 Hot-Plug power supplies in redundant mode

·      1× Storage Expansion Unit (Dell PowerVault MD1220), attached to the Dell PowerVault MD3220 Storage Unit

o  28.8 TB of raw capacity (24×1.2 TB Hot-Plug SAS HDDs, with 10,000 rpm and 6 Gbit/s)

o  2 Hot-Plug power supplies in redundant mode

·      20× Compute Nodes (Dell PowerEdge R720)

o   2 Intel Xeon E5-2695v2 CPUs (12 cores each) @ 2.40 GHz, 30 MB cache, 8.0 GT/s QPI

o  192 GB of registered and ECC DDR-3 memory (12×16 GB RDIMM, 1866 MT/s, Standard Voltage, Dual Rank, ×4 Data Width)

o  1 Integrated RAID Controller (PERC H710), with 512 MB Non Volatile Cache

o  146 GB on a RAID1 array (2×146 GB Hot-Plug SAS HDDs, with 15,000 rpm and 6 Gbit/s)

o  1 Broadcom 5720 quad port 1 Gbit/s network daughter card

o  1 single-port InfiniBand FDR 56 Gbit/s HCA (Mellanox ConnectX-3 VPI MCX353A-FCBT)

o  2 Hot-Plug power supplies in redundant mode

o  1 Dell iDRAC7 management board

·      1× server for virtualized nodes (Dell PowerEdge R720)

o  2 Intel Xeon E5-2660v2 CPUs (10 cores each) @ 2.20 GHz, 25 MB cache, 8.0 GT/s QPI

o  128 GB of registered and ECC DDR-3 memory (8×16 GB RDIMM, 1866 MT/s, Standard Voltage, Dual Rank, ×4 Data Width)

o  1 RAID Controller with 512 MB Non Volatile Cache

o  300 GB on a RAID1 array (2×300 GB Hot-Plug SAS HDDs, with 15,000 rpm and 6 Gbit/s)

o  1 Broadcom 5720 quad port 1 Gbit/s network daughter card

o  1 Broadcom 57810 dual port 10 Gb DA/SFP+ converged network adapter

o  2 Hot-Plug power supplies in redundant mode

o  1 Dell iDRAC7 management board

·      1× InfiniBand switch (Mellanox MSX6036F-1SFR)

o  36 non-blocking QSFP ports, each with a latency below 200 ns and a data transfer rate of 56 Gbit/s

o  Aggregate data transfer rate of 4.032 Tb/s

o  InfiniBand/Ethernet gateway

o  2 Hot-Plug power supplies in redundant mode

·      2× Ethernet Switches, 1 Gbit/s (Dell PowerConnect 5548)

o  48 ports, each at 1 Gbit/s

o  1 GB of RAM and 16 MB of flash memory

o  2 stacking HDMI ports

·      2× Uninterruptible Power Supplies (AEC NST5400030015)

o  Three-phase units with 30 kVA each

o  40 internal batteries of 9 Ah in each unit, plus 40 external batteries of 9 Ah per unit

o  Operation in redundant mode

·      4× Power Distribution Units (Racktivity ES6124-32)

o  Three-phase units with real-time, true-RMS measurements at the outlet level (power, apparent power, current, power factor, kVAh, consumption)

o  Individual power-outlet switching through the network

 

Resources - Software

The cluster runs the Linux OS (CentOS distribution, version 7.3) and has the following software installed:

·      SLURM Workload Manager (resource manager, scheduler, and accounting)

·      Environment Modules (user environment manager)

·      Compilers

o  GCC (versions 4.8.5, 4.9.3, 5.4.0, and 6.1.0)

§  C/C++, Fortran, Objective-C/C++, and Java

§  Not subject to license availability

o  Intel Composer XE (version 2016.3.210)

§  C/C++ and Fortran

§  2 network floating licenses

§  For academic use only

·      Message Passing Interface (MPI), for parallel computing

o  OpenMPI

§  Version 2.0.1

§  Available for all versions of the aforementioned compilers

o  MPICH

§  Version 3.2

§  Available for all versions of the aforementioned compilers

o  MVAPICH2

§  Version 2.2

§  Available for all versions of the aforementioned compilers

·      Libraries

o  GMP (GNU Multiple Precision Arithmetic Library), version 6.1.0

o  ISL (GNU Integer Set Library), versions 0.11.1, 0.12.2, 0.14, and 0.16

o  MPFR (GNU Multiple-Precision Floating-point computations with correct Rounding), version 3.1.4

o  CLooG (library to generate code for scanning Z-polyhedra), versions 0.18.0 and 0.18.1

o  Intel DAAL (Data Analytics and Acceleration Library), version 2016.3.310

o  Intel IPP (Integrated Performance Primitives), version 2016.3.210

o  Intel TBB (Threading Building Blocks), version 2016.3.210

o  Intel MKL (Math Kernel Library – a BLAS library optimized for Intel processors), version 2016.3.210

o  Mellanox MXM (Messaging Accelerator – for IB communications), version 3.5.3092

o  Mellanox FCA (Fabric Collective Accelerator – for MPI software), version 2.5.2431

o  Mellanox HCOLL (Hierarchical Collectives – for MPI software), version 3.6.1228

o  Mellanox KNEM (High-Performance Intra-Node MPI Communication), version 1.1.2.90

·      Matlab (version R2016a)

o  2 network floating licenses

o  For academic use only

o  The following toolboxes are available

§  Parallel Computing Toolbox (2 licenses)

§  Distributed Computing Server (license for up to 64 workers)

§  Matlab Coder (1 license)

§  Matlab Compiler (1 license)

§  Simulink (2 licenses)

§  Simscape (2 licenses)

§  Simulink Verification and Validation (2 licenses)

§  Partial Differential Equations (2 licenses)

§  Statistics (2 licenses)

§  Curve Fitting (2 licenses)

§  Optimization (2 licenses)

§  Global Optimization (2 licenses)

§  Neural Networks (2 licenses)

§  Fuzzy Logic (2 licenses)

§  Signal Processing (2 licenses)

§  DSP System (2 licenses)

§  Image Processing (2 licenses)

§  Computer Vision (2 licenses)

§  Mapping (2 licenses)

§  Bioinformatics (2 licenses)

·      COMSOL Multiphysics (version 5.2a)

o  1 network floating license for the base software and all modules

o  For academic use only

o  The following modules are available

§  Electrical – AC/DC

§  Multipurpose – Optimization

§  Multipurpose – Particle Tracing

§  Multipurpose – Material Library

§  Interfacing – LiveLink for Matlab
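A typical session combines the tools above: Environment Modules selects a compiler and an MPI implementation, the MPI compiler wrapper builds the program, and SLURM runs it. The module names below are illustrative assumptions; the actual names on Minerva can be listed with `module avail`:

```shell
# List the software modules installed on the cluster
module avail

# Load a compiler and a matching MPI implementation
# (module names are examples and may differ on Minerva)
module load gcc/6.1.0
module load openmpi/2.0.1

# Compile an MPI program with the MPI compiler wrapper
mpicc -O2 -o hello_mpi hello_mpi.c

# Submit it to SLURM on 48 cores spread over two 24-core compute nodes
sbatch --nodes=2 --ntasks=48 --wrap="srun ./hello_mpi"
```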

Besides the aforementioned software, additional software packages can be installed, as long as all of the following conditions are met:

·      The software is for the Linux OS, and compatible with the CentOS 7.3 distribution;

·      The software is open source and free of charge, or, if it is commercial, the user, project, or institution requesting it pays the fees and the license maintenance, when applicable (in that case, the software will be made available only to the users designated by the paying user, project, or institution);

·      The software is appropriate for cluster computing, which means it must be able to run without user intervention, that is, without interactive user interfaces of any kind.
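The last condition also applies to packages usually driven through a GUI, such as Matlab, which must be run in batch mode. As a hedged sketch (the script name "my_analysis" is an example, not part of the cluster setup), a non-interactive Matlab job might look like:

```shell
#!/bin/bash
# Hypothetical SLURM script running Matlab without any user interface.
#SBATCH --ntasks=1
#SBATCH --time=02:00:00

# -nodisplay and -nosplash suppress the GUI; -r runs the named
# Matlab script and then exits instead of waiting for input.
matlab -nodisplay -nosplash -r "my_analysis; exit"
```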

How to request access

All researchers (teachers and students) of the Polytechnic Institute of Coimbra (IPC) can access the Minerva Cluster for research (non-commercial) purposes, and they will have access to all the software described in the section Resources - Software. Users from outside IPC may also use the cluster, but fees may apply. Users interested in using the cluster must send an e-mail to admin@laced.isec.pt, requesting approval for an account. That e-mail must contain a brief description of the project for which the computational resources are required, as well as the resources needed, including: storage space, maximum computing time needed for each job to complete, total requested computation time, and any additional software that would have to be installed. Users may also provide any other information they consider pertinent.

More information is available in the Minerva User Guide.

Contacts

You can contact us by filling in this form. We will reply as soon as possible.

Address

Rua Pedro Nunes
Quinta da Nora
3030-199 COIMBRA
Portugal

Telephones

Telephone: +351 239 790 200
Fax: +351 239 790 201

E-mail

info@isec.pt