Correction of the recording artifacts and detection of the functional deviations in ECG by means of syndrome decoding with an automatic burst error correction of the cyclic codes using periodograms for determination of code component spectral range. Part II: Old mathematics for the novel applied p

1 Institute of Energy Problems of Chemical Physics, Russian Academy of Sciences, Russia, 119334 Moscow, Leninsky prosp. 28, building 2
2 Institute of Biology and Chemistry, Moscow Pedagogical State University, Russia, 129164 Moscow, Kibalchicha str. 6, building 1
3 Bakulev Scientific Center of Cardiovascular Surgery, Russia, 121552 Moscow, Rublevskoe sh. 135
* Corresponding author: phone: +7 (915) 492-29-43, e-mail: neurobiophys@gmail.com


Recognition of the statistically relevant signal components using Lomb periodograms and phase pseudo-Scargle sampling control
In general, ECG spectral analysis is based on the study of repetitive heart rate oscillations with different periods and hence can readily be performed in the framework of cyclic (repeating) codes. In this case the power of the periodic oscillations can be determined from the rhythmogram, which is regarded as a unified process. In the standard extended spectral analysis of electrocardiograms (including telemetric ones) the calculated parameters usually include: the power spectrum density (TP) in different frequency ranges (HF at 0.75-3 Hz, LF at 0.02-0.75 Hz and VLF at < 0.02 Hz), the same parameters in normalized units (HFnu, LFnu), the relative power in the same frequency ranges (HF%, LF%, VLF%), the centralization index and the sympathovagal balance (LF/HF). The spectral power distribution provides information about the neuroendocrine regulation of the heart functions. The indicators LF, LFnu and LF% characterize the activity of the sympathetic nervous system, while HF, HFnu and HF% reflect the state of the parasympathetic nervous system, and the power spectrum indicators VLF and VLF% reflect the level of humoral regulation [1][2][3][4][5]. The proposed approach does not exclude these parameters, but extends the standard approach with new descriptors.
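The standard descriptors listed above can be illustrated with a minimal sketch. It assumes that a one-sided power spectral density of the rhythmogram is already available as two arrays; the function name `spectral_descriptors`, the helper `_trapezoid` and the normalization of the nu values by LF + HF (a common convention, not stated in the text) are assumptions made for illustration only. The band limits are the ones quoted in the paragraph above.

```python
import numpy as np

# Band limits in Hz, taken from the text above.
BANDS_HZ = {"VLF": (0.0, 0.02), "LF": (0.02, 0.75), "HF": (0.75, 3.0)}


def _trapezoid(y, x):
    """Trapezoidal integration written out explicitly (no library dependency)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def spectral_descriptors(freqs, psd):
    """Return band powers, relative powers, normalized units and LF/HF.

    freqs, psd: one-sided PSD samples (hypothetical input format).
    Assumes the total power and the HF power are non-zero.
    """
    freqs = np.asarray(freqs, dtype=float)
    psd = np.asarray(psd, dtype=float)

    power = {}
    for name, (lo, hi) in BANDS_HZ.items():
        mask = (freqs >= lo) & (freqs < hi)
        # At least two samples are needed to integrate over a band.
        power[name] = _trapezoid(psd[mask], freqs[mask]) if mask.sum() > 1 else 0.0

    out = dict(power)
    out["TP"] = sum(power.values())
    for name in BANDS_HZ:
        out[name + "%"] = 100.0 * power[name] / out["TP"]

    # Normalized units: assumed convention LFnu = LF / (LF + HF) * 100.
    lf_plus_hf = power["LF"] + power["HF"]
    out["LFnu"] = 100.0 * power["LF"] / lf_plus_hf
    out["HFnu"] = 100.0 * power["HF"] / lf_plus_hf
    out["LF/HF"] = power["LF"] / power["HF"]
    return out


if __name__ == "__main__":
    f = np.linspace(0.0, 3.0, 3001)
    flat = np.ones_like(f)  # synthetic flat spectrum, for demonstration only
    for key, value in spectral_descriptors(f, flat).items():
        print(f"{key:6s} {value:.4f}")
```

For a flat test spectrum the band powers are simply proportional to the band widths, which makes the sketch easy to check by hand.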
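Recognition of the frequency code elements within the cyclicity analysis, together with a statistically relevant choice and noise reduction, can be performed using the Lomb periodogram method [6]. Within this approach, for each frequency ω the counts $t_k$ are shifted by τ(ω). Introducing the time counts

$$x_k = x(t_k), \qquad k = 0, 1, \ldots, N-1,$$

and limiting $(\varphi_1, \varphi_2)$ by the orthogonality condition

$$(\varphi_1, \varphi_2) = \sum_{k=0}^{N-1} \cos\omega(t_k - \tau)\,\sin\omega(t_k - \tau) = 0$$

makes it possible to determine τ(ω) from

$$\tan 2\omega\tau = \frac{\sum_{k} \sin 2\omega t_k}{\sum_{k} \cos 2\omega t_k},$$

resulting in the following spectrum representation:

$$D_L(\omega) = \frac{1}{2}\left[\frac{\left(\sum_{k} x_k \cos\omega(t_k - \tau)\right)^2}{\sum_{k} \cos^2\omega(t_k - \tau)} + \frac{\left(\sum_{k} x_k \sin\omega(t_k - \tau)\right)^2}{\sum_{k} \sin^2\omega(t_k - \tau)}\right],$$

where all sums run over k = 0, 1, ..., N-1.

Phase analysis can also be performed using modified Lomb spectra in the Scargle modification (for details see [7][8][9][10]), which have the form of a Fourier transform corresponding to the conventional Lomb spectra.

In general, the least squares method gives the following formula for the spectrum calculation:

$$D(\omega) = \frac{1}{2}\,\frac{(x, \varphi_1)^2 (\varphi_2, \varphi_2) - 2 (x, \varphi_1)(x, \varphi_2)(\varphi_1, \varphi_2) + (x, \varphi_2)^2 (\varphi_1, \varphi_1)}{(\varphi_1, \varphi_1)(\varphi_2, \varphi_2) - (\varphi_1, \varphi_2)^2},$$

where $(u, v) = \sum_{k} u(t_k)\, v(t_k)$ and $\|u\|^2 = (u, u)$. Here the timeline sampling sequence of the ECG parameters $x_k = x(t_k)$, k = 0, 1, ..., N-1, is approximated by the model function

$$y(t) = a\,\varphi_1(t) + b\,\varphi_2(t),$$

where $\varphi_1(t) = \cos\omega t$ and $\varphi_2(t) = \sin\omega t$. Using either numerical simulation data of normal and pathological electrocardiograms [11] or methods of their approximation [12,13] makes it possible to detect the deviation from the normal value, established from the approximation discrepancy $\varepsilon_k = x_k - y(t_k)$. The model coefficients are found from the condition $\|\varepsilon\|^2 = \min$, which leads to the system of normal equations

$$a\,(\varphi_1, \varphi_1) + b\,(\varphi_1, \varphi_2) = (x, \varphi_1), \qquad a\,(\varphi_1, \varphi_2) + b\,(\varphi_2, \varphi_2) = (x, \varphi_2),$$

and its solution can be written as

$$a = \frac{(x, \varphi_1)(\varphi_2, \varphi_2) - (x, \varphi_2)(\varphi_1, \varphi_2)}{(\varphi_1, \varphi_1)(\varphi_2, \varphi_2) - (\varphi_1, \varphi_2)^2}, \qquad b = \frac{(x, \varphi_2)(\varphi_1, \varphi_1) - (x, \varphi_1)(\varphi_1, \varphi_2)}{(\varphi_1, \varphi_1)(\varphi_2, \varphi_2) - (\varphi_1, \varphi_2)^2}.$$

From the general prerequisites it can be concluded that the spectrum formula (1.7) possesses a characteristic feature:

$$D(\omega) = \frac{1}{2}\left(\|x\|^2 - E_{\min}(\omega)\right),$$

where $E_{\min}(\omega)$ is the minimum of the square residual $\|\varepsilon\|^2$ and $\|x\|^2$ is the dispersive power of the series. We used the Vityazev notation [14][15][16] with the 1/2 multiplier for comparison with the conventional Schuster periodograms. A similar approximation in ECG recognition, including its application to pathological state coding using the cyclic code approach, is considered in detail in the Discussion section of this paper (Part III).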
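The shifted-phase least squares spectrum described above can be sketched numerically as follows. This is a minimal illustration of the classical Lomb periodogram with the 1/2 multiplier, not the authors' software; the function name `lomb_periodogram`, the mean subtraction and the synthetic test signal are assumptions introduced here.

```python
import numpy as np


def lomb_periodogram(t, x, omegas):
    """Classical Lomb periodogram with the 1/2 multiplier.

    t      : sampling times (may be irregular)
    x      : samples x_k = x(t_k)
    omegas : strictly positive angular frequencies (rad per time unit)
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # assumption: the series is centred before analysis

    power = np.empty(len(omegas))
    for i, w in enumerate(omegas):
        # Phase shift tau(w) that makes the cosine and sine terms orthogonal.
        tau = np.arctan2(np.sum(np.sin(2 * w * t)),
                         np.sum(np.cos(2 * w * t))) / (2 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        power[i] = 0.5 * ((x @ c) ** 2 / (c @ c) + (x @ s) ** 2 / (s @ s))
    return power


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic, irregularly sampled test signal (not ECG data).
    t = np.sort(rng.uniform(0.0, 60.0, 200))
    x = np.sin(2 * np.pi * 1.2 * t) + 0.3 * rng.standard_normal(t.size)

    freqs_hz = np.linspace(0.1, 3.0, 581)
    p = lomb_periodogram(t, x, 2 * np.pi * freqs_hz)
    print("peak frequency, Hz:", freqs_hz[np.argmax(p)])
```

Because τ(ω) is recomputed for every frequency, the estimate does not depend on the choice of the time origin, which is the property that makes the method suitable for irregularly sampled series.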

The transition from the time to frequency transformations and vice versa
In the electrocardiographic/cardiometric case the principal values widely used in practice are the directly measured durations and intervals rather than the indirectly calculated frequencies, wavenumbers, etc. For this reason we provided a program-reversible transition to the desired units. An example of such an approach is given below, although in the test experiments the abscissa units were represented in counts per time unit. In a simple model case there is a calculation rule with many variables that allows one to obtain a timeline representation x(t) [17], and the frequency image X(ω) is obtained from it according to the Fourier transform rule. The frequency parameters of the Fourier transform and its analogs can be converted into time parameters and vice versa using Parseval's identity [18].

Considering this problem from the dialectical positions outlined in the above-cited paper by Peschel M., "Modellbildung für Signale und Systeme", and using a simplification, one can obtain an inequality between the time and frequency moments which at k = l = 1 is equivalent to the Heisenberg uncertainty relation. Consequently, minimization of the central time moment results in an increase of the central frequency moment and vice versa, provided that the signal value remains constant. This places certain and evident limits on ECG pattern recognition during the encoding and machine fingerprinting of electrocardiograms.
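The trade-off between the time and frequency moments can be checked numerically. The sketch below assumes that the moments in question are the normalized second central moments of |x(t)|² and |X(ω)|² (the exact definitions used in the paper are not reproduced here); the function name `time_frequency_spreads` and the two test pulses are illustrative assumptions. It also verifies the discrete form of Parseval's identity for the unnormalized FFT.

```python
import numpy as np


def time_frequency_spreads(x, dt):
    """Return (rms duration, rms bandwidth, time energy, frequency energy).

    The spreads are square roots of the normalized second central moments
    of |x(t)|^2 and |X(omega)|^2 (an assumed definition of the moments).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    t = (np.arange(n) - n // 2) * dt
    spectrum = np.fft.fft(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)

    e_t = np.abs(x) ** 2
    e_w = np.abs(spectrum) ** 2

    t0 = np.sum(t * e_t) / e_t.sum()
    w0 = np.sum(omega * e_w) / e_w.sum()
    d_t = np.sqrt(np.sum((t - t0) ** 2 * e_t) / e_t.sum())
    d_w = np.sqrt(np.sum((omega - w0) ** 2 * e_w) / e_w.sum())

    # Discrete Parseval identity: sum |x|^2 == (1/N) sum |X|^2.
    return d_t, d_w, e_t.sum(), e_w.sum() / n


if __name__ == "__main__":
    n, dt = 4096, 0.01
    t = (np.arange(n) - n // 2) * dt

    gauss = np.exp(-t ** 2 / 2.0)   # reaches the lower bound 1/2
    laplace = np.exp(-np.abs(t))    # stays above the bound

    for name, pulse in (("gaussian", gauss), ("two-sided exponential", laplace)):
        d_t, d_w, energy_t, energy_w = time_frequency_spreads(pulse, dt)
        print(name, "product:", d_t * d_w, "energies:", energy_t, energy_w)
```

Narrowing a pulse in time widens it in frequency, so the product of the two spreads cannot fall below 1/2; the Gaussian pulse is the limiting case.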

The check digit method (syndrome decoding)
This method is interpreted here for electrocardiographic data analysis in accordance with the monographs [19,20], considering its extrapolation to more complex and non-binary codes [21][22][23][24][25], but in a form different from the original Hamming idea of self-checking and self-correcting codes [26]. It is noteworthy that the latter allows the analysis of telemetric [27], and hence biotelemetric, data, which makes it suitable for the on-line analysis of RF telemetric ECG data.
Assume that the code in the reference cardiogram model is given together with a coding function, which can be considered as a bijection mapping any relevant diagnostic element of the ECG sampling to an element of the set (field) of its automatically identified values, owing to its injection and surjection properties. Then any cardiogram can be considered as a realization of a code combination with the same indices, where the check symbol (e.g. a reference P peak under myocardium excitation, the QRS complex elements when starting from the ventricular systole, or the ST or T segments during repolarization of the ventricular myocardium) can be written as a linear form

$$\beta_j = F_j(x) = \sum_i c_{ij}\, x_i,$$

where $c_{ij}$ is the coefficient of the independent variable $x_i$ in the linear form $F_j(x)$.

Considering the sampling (combination) together with the identified noise vector, the information symbols and the check symbols of the combination can be written in generalized form as the original symbols distorted by the corresponding noise components. The calculation of the check symbol values $\beta^*_j$ from the information symbols is in this case equivalent to the calculation of the differences between the observed check symbols and those recomputed from the observed information symbols, where the set G of these differences is the check digit, or syndrome. This digit allows not only the detection of the signs of artifacts or pathology, but also the determination of their location in the numerical data sample of the ECG. Since the elements of the check digit depend only on the values of the noise vector components, the check digit is invariant with respect to the code combinations; hence it can be used as a stable diagnostic sign, i.e. as a basis for the design of robust ECG data analysis algorithms.
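The principle of locating an error from the check digit can be illustrated on the simplest textbook case. The sketch below uses the binary Hamming (7,4) code as a stand-in; it is not the ECG code described in the text, and the names `encode`, `syndrome` and `correct` are chosen here for illustration. It shows the two properties stated above: the syndrome points to the position of the distorted symbol, and it depends only on the noise vector, not on the transmitted combination.

```python
import numpy as np

# Parity-check matrix of the binary Hamming (7,4) code.
# Column j (1-based) is the binary representation of j, most significant bit first.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])


def encode(data):
    """Place 4 information bits at positions 3, 5, 6, 7 and add 3 check bits."""
    d3, d5, d6, d7 = data
    p1 = d3 ^ d5 ^ d7
    p2 = d3 ^ d6 ^ d7
    p4 = d5 ^ d6 ^ d7
    return np.array([p1, p2, d3, p4, d5, d6, d7])


def syndrome(word):
    """Check digit: all zeros for an undistorted code combination."""
    return (H @ word) % 2


def correct(word):
    """Return the corrected word and the 1-based error position (0 = no error)."""
    s = syndrome(word)
    position = int(s[0] * 4 + s[1] * 2 + s[2])
    fixed = word.copy()
    if position:
        fixed[position - 1] ^= 1
    return fixed, position


if __name__ == "__main__":
    noise = np.array([0, 0, 0, 0, 1, 0, 0])  # single error at position 5
    for data in ((1, 0, 1, 1), (0, 1, 1, 0)):
        sent = encode(data)
        received = (sent + noise) % 2
        fixed, position = correct(received)
        print(data, "syndrome:", syndrome(received), "position:", position,
              "restored:", bool(np.array_equal(fixed, sent)))
```

Both code combinations in the usage example yield the same syndrome for the same noise vector, which is the invariance that makes the check digit usable as a stable diagnostic sign.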
It is important to mention the high resistance of single-input Lomb periodograms (for templates and cyclic code data) to transmission changes under filtering with different frames. In contrast, the power fluctuations are more significant (see Part III of this paper), so power-based measures cannot be applied for the correct morphological analysis of electrocardiograms [28,29]. This is also the reason why power-based (not matched) filters are inapplicable as morphological filters for ECG analysis [30] and as diagnostic waveform converters [31].
It is also necessary to discuss the possible obstacles and specific details arising from the implementation of the above approach, since this paper describes only the formal methodological basis, although the software for automatic ECG data processing using cyclic code decoding is under development.

Methodological and terminological issues
The main problem is that ECG sequences cannot be unambiguously attributed to cyclic codes, despite the fact that the BCH code seems to be the optimal choice among the programs for detecting multiple independent pathological dysfunctions (or independently occurring mistakes or artifacts) [32,33]. In fact, consider cycles over the field GF(PQRST), where P ∨ Q ∨ R ∨ S ∨ T denotes the reference symbol in the actually observed ECG state. When the reference symbol is placed between the informational ones with such regularity that there is a known (calculable from the above condition) number of reference symbols for each given number of informational symbols and vice versa, ECG data can be considered as sets approximated by the Hagelbarger recurrent codes [34]. On the one hand, ECG coding belongs to the separable codes, since the functions of each of the symbols P ∨ Q ∨ R ∨ S ∨ T are well known; hence, according to their registration stability, they can be assigned either to informational symbols (variables changing in a predetermined range and indicating the status of the organism or the parameters of the registration process) or to reference symbols. On the other hand, it is possible to perform more frequent checks on each cycle P-P, Q-Q and to use a number of symbols (e.g. a QRS complex) as the reference. The real form of the measuring and computing process then corresponds better to the Elias iterated codes, which use a number of checking systems (including table-based ones, in accordance with a reference template or, in modern versions, a database) [35]. Shift registers can also be applied here, as in the simpler cases (the only problem arose when detecting the U component after the T component in the model data).
In the most primitive case the model of PQRST control can be represented as a finite automaton, where X stands for the input, Y for the internal and Z for the output coordinate vectors, and τ determines the time moment (clock). The external input set $x_1^{\tau}, \ldots, x_n^{\tau}$ together with the internal input state $y_1^{\tau}, \ldots, y_m^{\tau}$ transfers the finite dynamic system to the state represented by the internal input state $y_1^{\tau+1}, \ldots, y_m^{\tau+1}$, preceded by the external output set $z_1^{\tau}, \ldots, z_k^{\tau}$ and the internal output state $w_1^{\tau}, \ldots, w_l^{\tau}$ at the moment τ.

Now let us compare the functions implemented by the apparatus on normal tissue (the zero case) and on tissue with a diagnosed pathology of type i: $\Psi_0 = \Psi_0(\Lambda, \tau)$ and $\Psi_i = \Psi_i(\Lambda, \tau)$, where Λ stands for the control actions and Ψ for the actions performed. When there is a certain diagnosed heart pathology $s_i$, the rhythmogram registers a known deviation $s_i$ from the reference normal state. It is described by the function $\Psi_i = \Psi_i(\Lambda, \tau)$, which is defined on the same set T and takes values from the same set R, $r_{ij} = \Psi_i(t_j)$, as the function $\Psi_0$ performed by a reference system without any detectable deviations from the normal parameters. The single checks $t_j$, j = 1, 2, ..., |T|, and their results in this case unequivocally correspond to the functions $\Psi_i$, i = 0, 1, ..., M. A formal law for checking (diagnosing) the organism state and localizing the functional cardiophysical deviations in the attribute space is then a check of the organism state relative to the conventional normal state, based on the difference between the functions $\Psi_i$ and $\Psi_k$, i, k ∈ {0, 1, ..., M}, i ≠ k, in the check $t_j$ according to the relation $a_{ik,j} \in A$, which takes two values.

However, such a simplified binary approach only allows the detection of a PQRS outlier that differs from the normal one in the investigated field. The distinction between various pathologies is beyond the scope of this method, since it requires multiple multiparametric / multivariate calibration. This corresponds either to the simplified search for and prevention of artifacts or to the earliest stages of automated diagnostics, when the diagnostic result of a certain patient k is represented in the form of a ternary vector $f_k\{s_i\}$, where $s_i = 1$ in the presence of the symptom, $s_i = 0$ if it is not observed and $s_i = -1$ if the symptom was not investigated (i = 1, ..., m). Based on the classical works on medical cybernetics by Brodman and coauthors [36][37][38][39][40], different authors using a similar approach have represented the diagnostic value $\rho(s_i, d_j)$ of the symptom $s_i$ for a diagnosis $d_j$ (j = 1, ..., n) in explicit form, applied the Kullback divergence [41] as the informativeness measure, or even used the Shannon information measure.

In terms of deterministic logic, the disease model is based on the comparison of a certain unknown vector $f_k$ with the standard one. The standard (the physiological norm or reaction norm of PQRST) can be stored in memory as a Boolean function. Using statistical methods, the disease model is built by finding the most plausible estimate, and under minimization of the average diagnostic risk the optimal Bayesian rule is applied: the value t at which the maximum is reached is found, while at t = 0 the decision to refuse the diagnostics of the vector $f_k$ is made. However, even Willson pointed out that the Bayesian approach cannot be fully automated and is rather subjective [42]. Let us demonstrate this with fairly trivial examples. When multialternative sequential analysis is used, the computational rule requires the calculation of the likelihood ratio estimate Λ to be continued until it reaches one of the thresholds ($\Lambda \ge A_{t,j}$ or $\Lambda \le B_{t,j}$), where $d_j$, $d_t$ (j, t = 1, ..., n, t ≠ j) are the classes of diseases (diagnoses); $A_{t,j}$ and $B_{t,j}$ are thresholds determined from the given reliability of diagnostics (supervised learning) for each pair of compared classes; Λ is the likelihood ratio; and $p(s_i/d_j)$ and $p(s_i/d_t)$ are the a priori probabilities of the emergence of the symptom $s_i$ in the classes $d_j$ and $d_t$.

From the standpoint of multiparametric diagnostics, in the presence of a plurality of functions $f = \{f_i(\alpha)\}$ (i = 1, 2, ..., M), where α is a certain alternative, it is possible to consider the multiparametricity problem in decision making during diagnostics. If $\alpha_0$ is an effective alternative for the multiple criteria $f = \{f_i\}$ (i = 1, ..., M), then $\alpha_0$ is an effective alternative for the set of functions $W = \{w_i(f_i(\alpha))\}$ (i = 1, ..., M), where $w_i(f_i(\alpha))$ is a monotonic function of $f_i(\alpha)$, and vice versa. The monotonic transforms are defined separately for the maximized and for the minimized criteria in terms of $f_i^0$, the optimal value of the i-th criterion, $f_i^{\min}$, the minimal value of a maximized criterion, and $f_i^{\max}$, the maximal value of a minimized criterion. Under sufficiently general conditions this gives a number of effective alternatives. A palliative solution gives the minimal relative deviation from the optimal values for all the criteria according to the weight coefficients $\rho_i$. If the criteria are equifinal, $\rho_i = 1/M$. Otherwise the palliative solution is the one with equal weighted mean values. Then $w_i$ satisfy the condition $0 < k_0 w_i < 1$ in the case of equifinal criteria.
... parameters, but also many other physically relevant parameters that are seldom used in biomedical practice; together they make up a set of values which can be considered a reliable and sufficient basis for the identification and confirmation of a diagnosis.
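The Bayesian rule applied to the ternary symptom vector discussed in this section can be sketched as follows. The sketch assumes conditional independence of the symptoms within each diagnostic class (a naive Bayes simplification that the text does not state), uses invented probabilities, and introduces the names `posterior`, `diagnose` and `min_posterior` purely for illustration. A returned class index of 0 stands for the refusal to diagnose, in line with the convention t = 0 used above.

```python
import numpy as np


def posterior(f, prior, p_symptom):
    """Posterior probabilities of the diagnoses d_1..d_n for a ternary vector f.

    f         : symptoms, 1 = present, 0 = absent, -1 = not investigated
    prior     : prior probabilities of the n classes
    p_symptom : array (n, m), probability of symptom i being present in class j
                (hypothetical values; symptoms assumed independent within a class)
    """
    p_symptom = np.asarray(p_symptom, dtype=float)
    log_post = np.log(np.asarray(prior, dtype=float))
    for i, s in enumerate(f):
        if s == 1:
            log_post = log_post + np.log(p_symptom[:, i])
        elif s == 0:
            log_post = log_post + np.log1p(-p_symptom[:, i])
        # s == -1: the symptom was not investigated and carries no evidence
    post = np.exp(log_post - log_post.max())
    return post / post.sum()


def diagnose(f, prior, p_symptom, min_posterior=0.6):
    """Return (t, posterior); t = 0 means that the diagnosis is refused."""
    post = posterior(f, prior, p_symptom)
    best = int(np.argmax(post))
    t = best + 1 if post[best] >= min_posterior else 0
    return t, post


if __name__ == "__main__":
    prior = [0.5, 0.5]
    p_symptom = [[0.9, 0.2],   # class d_1
                 [0.1, 0.8]]   # class d_2
    print(diagnose([1, 0], prior, p_symptom))    # evidence favours d_1
    print(diagnose([-1, -1], prior, p_symptom))  # nothing investigated: refusal
```

The refusal threshold `min_posterior` has to be chosen by the investigator, which is one concrete place where the subjectivity of the Bayesian approach mentioned above enters.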
As follows from the above approach, during the compactification (i.e. the convolutional coding) of the full array of ECG data convolved into the discrete PQRST-code index sampling, it is impossible to isolate from the array the variables not included in the sampling. Therefore the coding of the data array should initially be performed in such a way that the variables which cannot be subsequently isolated serve as the coding criteria. In other words, the recognition of the code elements should be extended even in comparison with the spectral or rhythmogram-/periodogram-based approach. One should introduce a more complex scheme considering phase, delay, analysis of the thermodynamics of information transduction in the code (Kolmogorov entropy), the real and imaginary parts, hierarchical decompositions, etc. We have already introduced such approaches to bioacoustic fingerprinting for pulmonological purposes [43], mathematical bioacoustics [44], spectral analysis of self-oscillating and autowave processes in biochemical kinetics [45], synchronous analysis of fluorescence and electrophysiology at the cell level [46], phenology and phenospectral phytochemical analysis [47], multiwave methods of medical chronaximetry [48], methods of radiofrequency multiparametric spectral identification of cell and tissue telemetric signals [49], population-species analysis of avifauna [50] and its multiparametric identification [51], real-time analysis of the scanograms of the so-called «metrological forceps» and «metrological lancets» [52][53][54], and patch-clamp data analysis [55,56]. Therefore, it is a relatively simple matter to apply a number of the above methods to ECG signal processing, but this is beyond the scope of this paper and will be described in our forthcoming papers.

Statement on ethical issues