Practical approach for response time prediction of HANA analytical applications

This practical approach answers the question of how the response time of a single query running on HANA would differ on hardware with more or fewer physical CPU Cores or vCPUs, or when a parallelization quota (limit) per query is set.

The outcome of applying this approach is used as input for calculating the total number of Cores or vCPUs required for concurrent users running analytical queries on HANA in parallel.

 

INTRODUCTION

SAP HANA uses algorithms which automatically split the execution of an SQL query and parallelize it across several HANA threads. For this reason, the response time of a query on a multi-core machine is shorter than the response time of the same query on a machine with fewer cores.
In reality, the execution flow of a query has some sections which cannot be parallelized, for example serializing the result set to be sent to the client. With some queries such sequential (not parallelizable) sections are short, and with others they are longer.

 

HANA CONFIGURATION FOR PARALLEL EXECUTION

In SAP HANA Studio navigate to Administration Console -> Configuration view.
The following 6 parameters are related to parallelization in HANA:

  • executor.ini -> limits -> max_pop_threads
  • executor.ini -> limits -> max_server_pop_threads
  • indexserver.ini -> parallel -> num_cores
  • indexserver.ini -> parallel -> phys_num_cores
  • global.ini -> execution -> max_concurrency
  • global.ini -> execution -> max_concurrency_hint

If a parameter is missing on a concrete HANA installation, it has to be added.

In newer HANA releases the trend is that parallelization is controlled only with the parameter max_concurrency_hint. Nevertheless, for the time being it is still recommended to set all parameters, because the migration to the new consolidated parameter is staged.

 

RESPONSE TIME MEASUREMENTS

The measurements should be done first with the default parameter values, which usually represent the full CPU capacity of the hardware assigned to HANA (let this be N Cores).

Next, the measurements should be repeated with parameter values that limit the CPU capacity available to HANA, for example to N/2, N/4 or N-1 Cores (vCPUs). The choices should make sense for the concrete system size.

To change the parameter values to just 1 Core, run the script:

ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM') SET ('parallel', 'phys_num_cores') = '1' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM') SET ('parallel', 'num_cores') = '1' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('execution', 'max_concurrency') = '1' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('execution', 'max_concurrency_hint') = '1' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('executor.ini', 'SYSTEM') SET ('limits', 'max_pop_threads') = '1' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('executor.ini', 'SYSTEM') SET ('limits', 'max_server_pop_threads') = '1' WITH RECONFIGURE;

To change the parameter values to 4 Cores, run the script:

ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM') SET ('parallel', 'phys_num_cores') = '4' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM') SET ('parallel', 'num_cores') = '4' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('execution', 'max_concurrency') = '4' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM') SET ('execution', 'max_concurrency_hint') = '4' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('executor.ini', 'SYSTEM') SET ('limits', 'max_pop_threads') = '4' WITH RECONFIGURE;
ALTER SYSTEM ALTER CONFIGURATION ('executor.ini', 'SYSTEM') SET ('limits', 'max_server_pop_threads') = '1' WITH RECONFIGURE;

Following the same logic, use the configuration parameters to simulate 8, 16 and further numbers of Cores.
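The two scripts above differ only in the value being set, so the statements for any other core count can be generated. The following is a minimal sketch in Python, not part of the original approach: the function name simulation_script and the physical_cores argument are hypothetical helpers introduced only for illustration, and the generated statements simply mirror the pattern of the scripts above.

```python
# Parameters that are set to the simulated core count, as (file, section, key).
SCALED_PARAMS = [
    ("indexserver.ini", "parallel", "phys_num_cores"),
    ("indexserver.ini", "parallel", "num_cores"),
    ("global.ini", "execution", "max_concurrency"),
    ("global.ini", "execution", "max_concurrency_hint"),
    ("executor.ini", "limits", "max_pop_threads"),
]
# Parameter that stays at 1 in every simulation, as in the scripts above.
FIXED_PARAMS = [("executor.ini", "limits", "max_server_pop_threads", 1)]

TEMPLATE = (
    "ALTER SYSTEM ALTER CONFIGURATION ('{f}', 'SYSTEM') "
    "SET ('{s}', '{k}') = '{v}' WITH RECONFIGURE;"
)


def simulation_script(cores, physical_cores):
    """Return the ALTER SYSTEM statements that simulate `cores` CPU cores."""
    # Hypothetical guard: simulating more cores than exist gives meaningless results.
    if not 1 <= cores <= physical_cores:
        raise ValueError("simulated cores must be between 1 and the physical core count")
    stmts = [TEMPLATE.format(f=f, s=s, k=k, v=cores) for f, s, k in SCALED_PARAMS]
    stmts += [TEMPLATE.format(f=f, s=s, k=k, v=v) for f, s, k, v in FIXED_PARAMS]
    return stmts


# Example: statements for simulating 8 cores on a machine assumed to have 40.
print("\n".join(simulation_script(8, physical_cores=40)))
```

The generated statements are meant to be reviewed and then executed in the SQL console like the hand-written scripts.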

Note that the number of “simulated” Cores must not exceed the number of physically available Cores. If the parameters on a machine with 40 physical Cores are configured to simulate 60 Cores, the measurements will not make any sense.

Note that in every configuration ‘phys_num_cores’, ‘num_cores’, ‘max_pop_threads’, ‘max_concurrency’ and ‘max_concurrency_hint’ are set to the same value, i.e. the number of Cores to be simulated, while ‘max_server_pop_threads’ is kept constant at 1.

DO NOT FORGET: After the measurements are completed, reset the above parameters to the default configuration!
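One possible way to reset the parameters is to remove the values set on the SYSTEM layer with UNSET, so that HANA falls back to its defaults. The sketch below generates such statements in Python. It assumes that none of the six parameters had a custom SYSTEM-layer value before the measurements; if they had, the previous values need to be restored with SET instead. The function name reset_script is hypothetical.

```python
# The six parameters changed during the measurements, as (file, section, key).
PARAMS = [
    ("indexserver.ini", "parallel", "phys_num_cores"),
    ("indexserver.ini", "parallel", "num_cores"),
    ("global.ini", "execution", "max_concurrency"),
    ("global.ini", "execution", "max_concurrency_hint"),
    ("executor.ini", "limits", "max_pop_threads"),
    ("executor.ini", "limits", "max_server_pop_threads"),
]

TEMPLATE = (
    "ALTER SYSTEM ALTER CONFIGURATION ('{f}', 'SYSTEM') "
    "UNSET ('{s}', '{k}') WITH RECONFIGURE;"
)


def reset_script():
    """Return statements that remove the SYSTEM-layer values of all six parameters."""
    # Assumption: removing the SYSTEM-layer value restores the desired default.
    return [TEMPLATE.format(f=f, s=s, k=k) for f, s, k in PARAMS]


print("\n".join(reset_script()))
```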

 

HOW TO INTERPRET THE MEASUREMENTS

A prerequisite for interpreting the measurement results is that each measurement is executed more than once and that the result is reproducible. Measurements with more than 5% deviation are not acceptable.
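The reproducibility criterion can be checked mechanically. The sketch below assumes that "deviation" means the largest relative distance of any single run from the mean of all runs; the text does not define it more precisely, so this is only one plausible reading. The sample timings are made up for illustration.

```python
from statistics import mean


def max_relative_deviation(times):
    """Largest deviation of a single run from the mean, relative to the mean."""
    avg = mean(times)
    return max(abs(t - avg) for t in times) / avg


def is_reproducible(times, tolerance=0.05):
    """True if at least two runs exist and all lie within `tolerance` of the mean."""
    if len(times) < 2:
        raise ValueError("a measurement must be repeated at least once")
    return max_relative_deviation(times) <= tolerance


# Hypothetical response times in seconds for one core setting.
runs = [125.0, 123.8, 126.1]
print(f"max deviation: {max_relative_deviation(runs):.1%}, "
      f"reproducible: {is_reproducible(runs)}")
```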

A reproducible response time can be achieved only if no other queries are running in parallel and if the full CPU capacity of the HANA hardware is available to the measured query at the time it is executed.

For most HANA analytical queries, the response time measured with a simulation of 1 CPU Core is expected to be longer than the response time measured with a simulation of 2 CPU Cores or with the default parameter values. The bigger the difference, the better the scalability of the query.

Note! A response time measured with fewer simulated Cores that is shorter than the response time measured with more simulated Cores is not expected, and the reasons for such an anomalous measurement must be clarified.
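A simple plausibility check for this rule can be sketched as follows. The function name find_anomalies and the sample values are hypothetical; the check only flags neighbouring core counts where the run with fewer Cores was faster, and it does not explain why.

```python
def find_anomalies(measurements):
    """Return (fewer_cores, more_cores) pairs where fewer cores gave a shorter time.

    `measurements` maps the simulated core count to the response time in seconds.
    """
    ordered = sorted(measurements.items())
    return [
        (c1, c2)
        for (c1, t1), (c2, t2) in zip(ordered, ordered[1:])
        if t1 < t2
    ]


# Hypothetical series in which the 2-core run is slower than the 1-core run.
suspicious = {1: 100.0, 2: 104.0, 4: 60.0}
print(find_anomalies(suspicious))
```

An empty list means the series behaves as expected; any reported pair should be re-measured and investigated before the results are used.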

 

HOW TO PREDICT THE RESPONSE TIME

This is best explained with an example.

A dialog step (or HANA query) is measured with parameter simulations of 1, 2 and 4 CPU Cores, with the following results:

1 CPU Core:  125 [s] response time

2 CPU Cores:  80 [s] response time

4 CPU Cores:  57 [s] response time

Let P be the parallelizable part of the execution flow and S the sequential part of the execution flow for the query. Then, according to Amdahl’s Law:

Response time [s] = S + P / N

where N is the number of “simulated” CPU Cores.

Applying the formula for N=1 gives 125 [s] = S + P/1

Applying the formula for N=2 gives 80 [s] = S  + P/2

Subtracting the second equation from the first gives P/2 = 45 [s], so P = 90 seconds and S = 125 - 90 = 35 seconds.

Using the third measurement, with the simulation of 4 Cores, as a control:
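The same two-equation calculation can be written down for any pair of measurements. This is a minimal sketch; the function name solve_amdahl is hypothetical, and it assumes the model Response time = S + P / N holds exactly for both measurements.

```python
def solve_amdahl(n1, t1, n2, t2):
    """Solve t = S + P / n for S and P from two (cores, seconds) measurements."""
    if n1 == n2:
        raise ValueError("the two measurements must use different core counts")
    # Subtracting the two equations eliminates S: t1 - t2 = P * (1/n1 - 1/n2).
    p = (t1 - t2) / (1.0 / n1 - 1.0 / n2)
    s = t1 - p / n1
    return s, p


# Measurements from the example: 125 s with 1 core, 80 s with 2 cores.
s, p = solve_amdahl(1, 125.0, 2, 80.0)
print(f"S = {s} s, P = {p} s")  # S = 35.0 s, P = 90.0 s
```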

S  + P / 4 = 35 [s] + 90[s] / 4 = 35 [s] + 22.5 [s] = 57.5 [s]

The result from the formula, 57.5 [s], matches the measurement of 57 [s] very well (the difference is less than 1%, well within the 5% tolerance).

With S and P known, the response time for other numbers of CPU Cores, e.g. 16 Cores, 40 Cores and so on, can be estimated.

This estimation makes clear up to which number of CPU Cores the response time improves significantly, and therefore how many Cores it makes sense to invest in from a hardware perspective.
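The control step can be expressed as a small check that compares the prediction with the measurement. The sketch below uses the values S = 35 and P = 90 from the example; the function names are hypothetical.

```python
S, P = 35.0, 90.0  # sequential and parallelizable parts in seconds, from the example


def predicted_response_time(cores):
    """Response time predicted by the formula S + P / N."""
    return S + P / cores


def relative_error(predicted, measured):
    """Absolute difference between prediction and measurement, relative to the measurement."""
    return abs(predicted - measured) / measured


predicted = predicted_response_time(4)
error = relative_error(predicted, 57.0)
print(f"predicted {predicted:.1f} s, measured 57.0 s, error {error:.1%}")
# predicted 57.5 s, measured 57.0 s, error 0.9%
```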

In this concrete example, due to the specifics of this query, more than 16 CPU Cores do not improve the response time significantly, and even beyond 8 CPU Cores the improvement is small; customers therefore do not need to invest in a bigger system unless concurrent load is expected.
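The numbers behind this conclusion can be reproduced from the formula. The following sketch tabulates the predicted response time for the example values S = 35 and P = 90, together with the time saved by each doubling of the core count; the function names are hypothetical.

```python
S, P = 35.0, 90.0  # sequential and parallelizable parts in seconds, from the example


def response_time(cores):
    """Predicted response time S + P / N in seconds."""
    return S + P / cores


def doubling_gain(cores):
    """Seconds saved by going from `cores` to twice as many cores."""
    return response_time(cores) - response_time(2 * cores)


for cores in (1, 2, 4, 8, 16, 32):
    print(f"{cores:>2} -> {2 * cores:>2} cores: "
          f"{response_time(cores):7.2f} s -> {response_time(2 * cores):7.2f} s "
          f"(saves {doubling_gain(cores):5.2f} s)")

# No number of cores can bring the response time below the sequential part S.
print(f"lower bound: {S} s")
```

Each doubling saves only half as much time as the previous one, and the response time can never fall below S = 35 [s], which is why the curve flattens quickly for this query.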

[Figure: ResponseTimeAmdahl.png]

There are queries, for example aggregations (max, sum, etc.), which scale excellently if appropriate HANA partitioning for high data volumes is used, and which can therefore easily scale to 40 Cores and beyond.

To continue with sizing for concurrent users of HANA Analytical applications, read https://blogs.sap.com/2014/11/28/cpu-sizing-for-concurrent-users-running-hana-analytical-applications