import warnings

import pandas as pd
from dataprep.eda import create_report


# Ignore warnings
warnings.filterwarnings('ignore')


# A forward slash avoids the invalid "\d" escape sequence and also works on Windows
ds = pd.read_csv('SKAB/ds.csv', index_col=0)


create_report(ds)

Number of Variables	9
Number of Rows	45001
Missing Cells	0
Missing Cells (%)	0.0%
Duplicate Rows	701
Duplicate Rows (%)	1.6%
Total Size in Memory	5.9 MB
Average Row Size in Memory	136.7 B
Variable Types	Numerical: 8 Categorical: 1
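The dataset statistics above can be approximated with plain pandas. The sketch below is an assumption about how such figures might be derived, not dataprep's actual implementation; `dataset_overview` and the toy frame are hypothetical names used only for illustration. As a sanity check on the report, 701 duplicates in 45001 rows gives 701 / 45001 ≈ 1.56%.

```python
import pandas as pd


def dataset_overview(df: pd.DataFrame) -> dict:
    """Rough pandas equivalent of a dataset-level statistics table."""
    n_rows, n_cols = df.shape
    n_cells = n_rows * n_cols
    missing = int(df.isna().sum().sum())
    # duplicated() flags every row that repeats an earlier row
    duplicates = int(df.duplicated().sum())
    return {
        "n_variables": n_cols,
        "n_rows": n_rows,
        "missing_cells": missing,
        "missing_cells_pct": 100 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_rows_pct": 100 * duplicates / n_rows if n_rows else 0.0,
        # deep=True also counts the contents of object (string) columns
        "memory_bytes": int(df.memory_usage(deep=True).sum()),
    }


# Toy data standing in for the real CSV (illustrative only)
toy = pd.DataFrame({"a": [1.0, 1.0, 2.0, None], "y": ["0.0", "0.0", "1.0", "0.0"]})
overview = dataset_overview(toy)
print(overview)
```

Running `dataset_overview(ds)` on the real frame may differ slightly from the report, since dataprep could count memory or duplicates differently.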

Accelerometer1RMS and Accelerometer2RMS have similar distributions	Similar Distribution
Accelerometer1RMS is skewed	Skewed
Accelerometer2RMS is skewed	Skewed
Current is skewed	Skewed
Pressure is skewed	Skewed
Temperature is skewed	Skewed
Voltage is skewed	Skewed
Volume Flow RateRMS is skewed	Skewed
Dataset has 701 (1.56%) duplicate rows	Duplicates
y has constant length 3	Constant Length

Accelerometer1RMS has 24707 (54.9%) negatives	Negatives
Accelerometer2RMS has 24702 (54.89%) negatives	Negatives
Current has 8118 (18.04%) negatives	Negatives
Temperature has 30534 (67.85%) negatives	Negatives
Thermocouple has 24700 (54.89%) negatives	Negatives
Volume Flow RateRMS has 25328 (56.28%) negatives	Negatives
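The negative counts in these insights can be cross-checked directly. This is a minimal sketch, assuming the numeric columns are already loaded into a DataFrame; `negative_share` and the toy data are hypothetical. For reference, 24707 / 45001 ≈ 54.9%, which agrees with the first line above.

```python
import pandas as pd


def negative_share(df: pd.DataFrame) -> pd.DataFrame:
    """Count strictly negative values per numeric column."""
    numeric = df.select_dtypes(include="number")
    counts = (numeric < 0).sum()
    return pd.DataFrame(
        {
            "negatives": counts,
            "negatives_pct": 100 * counts / len(numeric),
        }
    )


# Toy data (illustrative only)
toy = pd.DataFrame(
    {
        "Current": [-0.1, 0.2, 0.3, 0.4],
        "Temperature": [-4.0, -3.0, -1.0, 1.0],
        "y": ["0.0", "0.0", "1.0", "0.0"],  # non-numeric, ignored
    }
)
shares = negative_share(toy)
print(shares)
```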

Accelerometer1RMS: Summary
Approximate Distinct Count	17200
Approximate Unique (%)	38.2%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Memory Size	3270574
Mean	-1.6485
Minimum	-4
Maximum	5
Zeros	2
Zeros (%)	0.0%
Negatives	24707
Negatives (%)	54.9%

Minimum	-4
5-th Percentile	-4
Q1	-4
Median	-3.5471
Q3	0.6796
95-th Percentile	1.7435
Maximum	5
Range	9
IQR	4.6796
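The derived rows of the quantile table follow from the others: Range = Maximum - Minimum = 5 - (-4) = 9, and IQR = Q3 - Q1 = 0.6796 - (-4) = 4.6796. A minimal pandas sketch of the same table is shown below; `quantile_summary` is a hypothetical helper, and pandas' default linear interpolation may not match the quantile method dataprep uses.

```python
import pandas as pd


def quantile_summary(s: pd.Series) -> dict:
    """Quantile table for one numeric column."""
    q05, q1, median, q3, q95 = s.quantile([0.05, 0.25, 0.5, 0.75, 0.95])
    return {
        "min": float(s.min()),
        "p05": float(q05),
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "p95": float(q95),
        "max": float(s.max()),
        "range": float(s.max() - s.min()),
        "iqr": float(q3 - q1),
    }


# Toy data: the integers 0..100 (illustrative only)
toy = pd.Series(range(101), dtype="float64")
summary = quantile_summary(toy)
print(summary)
```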

Data characteristics of the SKAB dataset

Dataset description

Exploratory Data Analysis

Overview

Dataset Statistics

Dataset Insights

Variables

Accelerometer1RMS

Quantile Statistics

Descriptive Statistics

Accelerometer2RMS

Quantile Statistics

Descriptive Statistics

Current

Quantile Statistics

Descriptive Statistics

Pressure

Quantile Statistics

Descriptive Statistics

Temperature

Quantile Statistics

Descriptive Statistics

Thermocouple

Quantile Statistics

Descriptive Statistics

Voltage

Quantile Statistics

Descriptive Statistics

Volume Flow RateRMS

Quantile Statistics

Descriptive Statistics

y

Length

Sample

Letter

Interactions

Correlations

Missing Values

Evaluating the results

Accelerometer1RMS: Descriptive Statistics
Mean	-1.6485
Standard Deviation	2.628
Variance	6.9065
Sum	-74185.9889
Skewness	0.4885
Kurtosis	-1.1319
Coefficient of Variation	-1.5941
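The rows of this table are related: Variance is the square of the Standard Deviation (2.628² ≈ 6.906), and the Coefficient of Variation is Standard Deviation / Mean (2.628 / -1.6485 ≈ -1.594), which is why it is negative here. The sketch below computes the same quantities with pandas. `descriptive_summary` is a hypothetical helper, and pandas applies bias corrections to skewness and excess kurtosis that may differ slightly from what dataprep reports.

```python
import pandas as pd


def descriptive_summary(s: pd.Series) -> dict:
    """Descriptive statistics for one numeric column."""
    mean = float(s.mean())
    std = float(s.std())  # sample standard deviation (ddof=1)
    return {
        "mean": mean,
        "std": std,
        "variance": float(s.var()),
        "sum": float(s.sum()),
        "skewness": float(s.skew()),
        "kurtosis": float(s.kurt()),  # excess kurtosis, 0 for a normal distribution
        "cv": std / mean if mean != 0 else float("nan"),
    }


# Toy data (illustrative only)
toy = pd.Series([1.0, 2.0, 3.0, 4.0])
stats = descriptive_summary(toy)
print(stats)
```

A coefficient of variation is hard to interpret when the mean is close to zero or negative, so the value above is best read alongside the standard deviation rather than on its own.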

Accelerometer2RMS: Summary
Approximate Distinct Count	16459
Approximate Unique (%)	36.6%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Memory Size	3270574
Mean	-1.694
Minimum	-4
Maximum	5
Zeros	1
Zeros (%)	0.0%
Negatives	24702
Negatives (%)	54.9%

Current: Summary
Approximate Distinct Count	40997
Approximate Unique (%)	91.1%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Memory Size	3270574
Mean	0.004256
Minimum	-0.002151
Maximum	1.0457
Zeros	1
Zeros (%)	0.0%
Negatives	8118
Negatives (%)	18.0%

Minimum	0
5-th Percentile	0.375
Q1	0.5
Median	0.5
Q3	0.625
95-th Percentile	0.625
Maximum	1.125
Range	1.125
IQR	0.125

Mean	0.5097
Standard Deviation	0.09883
Variance	0.009767
Sum	22938.3915
Skewness	-0.07541
Kurtosis	0.7088
Coefficient of Variation	0.1939

Temperature: Summary
Approximate Distinct Count	19961
Approximate Unique (%)	44.4%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Memory Size	3270574
Mean	-1.8749
Minimum	-4
Maximum	1.9248
Zeros	3
Zeros (%)	0.0%
Negatives	30534
Negatives (%)	67.8%

Thermocouple: Summary
Approximate Distinct Count	24336
Approximate Unique (%)	54.1%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Memory Size	3270574
Mean	-0.06129
Minimum	-1.7123
Maximum	2.4124
Zeros	1
Zeros (%)	0.0%
Negatives	24700
Negatives (%)	54.9%

Approximate Distinct Count	26414
Approximate Unique (%)	58.7%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Memory Size	3270574
Mean	0.907
Minimum	-0.001631
Maximum	1.01
Zeros	1
Zeros (%)	0.0%
Negatives	6
Negatives (%)	0.0%

Mean	-1.9065
Standard Deviation	2.3741
Variance	5.6365
Sum	-85794.0342
Skewness	0.2596
Kurtosis	-1.9227
Coefficient of Variation	-1.2453

y: Letter
Count	0
Lowercase Letter	0
Space Separator	0
Uppercase Letter	0
Dash Punctuation	0
Decimal Number	90002

y: Sample
1st row	0.0
2nd row	0.0
3rd row	0.0
4th row	0.0
5th row	0.0
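The letter statistics and sample rows for y are consistent with labels stored as three-character strings such as "0.0": each value holds two decimal digits, and 2 × 45001 = 90002 matches the Decimal Number count. A minimal sketch of that character-class count, using Python's `unicodedata` module, follows; `category_counts` and the toy labels are hypothetical, and this is not necessarily how dataprep computes the table.

```python
from collections import Counter
import unicodedata

import pandas as pd


def category_counts(values: pd.Series) -> Counter:
    """Count Unicode general categories over all characters in a column."""
    counts = Counter()
    for text in values.astype(str):
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


# Toy labels (illustrative only)
labels = pd.Series(["0.0", "1.0", "0.0"])
counts = category_counts(labels)
lengths = labels.str.len()

print(counts["Nd"])       # decimal digits
print(counts["Po"])       # other punctuation, here the "."
print(lengths.nunique())  # 1 means every value has the same length
```

If the real column behaves the same way, converting it with `pd.to_numeric` before modelling may be worth considering, since the report treats it as categorical text.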