timesead.data.preprocessing.common
==================================

.. py:module:: timesead.data.preprocessing.common


Functions
---------

.. autoapisummary::

   timesead.data.preprocessing.common.update_statistics_increment
   timesead.data.preprocessing.common.save_statistics
   timesead.data.preprocessing.common.minmax_scaler
   timesead.data.preprocessing.common.standard_scaler


Module Contents
---------------

.. py:function:: update_statistics_increment(frame: pandas.DataFrame, mean: numpy.ndarray = None, min_val: numpy.ndarray = None, max_val: numpy.ndarray = None, old_n: int = None) -> Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, int]

   Incrementally compute feature-wise mean, minimum, and maximum values for a dataset consisting of multiple
   :class:`~pandas.DataFrame`\s.

   :param frame: The current data with which the statistics should be updated.
   :param mean: Mean value returned by previous calls to the function or `None` if this is the first call.
   :param min_val: Minimum value returned by previous calls to the function or `None` if this is the first call.
   :param max_val: Maximum value returned by previous calls to the function or `None` if this is the first call.
   :param old_n: Total number of data points processed in earlier calls to the function or `None` if this is the
       first call.
   :return: A tuple `(mean, min_val, max_val, old_n + n)` that contains the updated statistics and the total
       number of processed data points.


.. py:function:: save_statistics(frame: pandas.DataFrame, path: str)

   Compute feature-wise mean, standard deviation, minimum, and maximum values for a dataset consisting of a
   single :class:`~pandas.DataFrame` and save them as a `.npz` file.

   :param frame: The dataset for which to compute and save statistics.
   :param path: Path to save the statistics via :func:`numpy.savez`.


.. py:function:: minmax_scaler(frame: Union[pandas.DataFrame, numpy.ndarray], stats: Dict[str, numpy.ndarray]) -> pandas.DataFrame

   Scale features in a dataset independently to the [0, 1] range using pre-computed minimum and maximum values.

   :param frame: The dataset to scale. This should either be a :class:`~pandas.DataFrame` or a
       :class:`~numpy.ndarray` of shape `(N*, D)`.
   :param stats: A dictionary that contains pre-computed feature-wise minimum and maximum values as
       :class:`~numpy.ndarray`\s of shape `(D,)` in the keys 'min' and 'max', respectively.
   :return: The scaled :class:`~pandas.DataFrame` or :class:`~numpy.ndarray`. Note that this will usually be a
       copy as this function does not modify any values in place.


.. py:function:: standard_scaler(frame: Union[pandas.DataFrame, numpy.ndarray], stats: Dict[str, numpy.ndarray]) -> pandas.DataFrame

   Scale features in a dataset such that they have zero mean and unit standard deviation using pre-computed mean
   and standard deviation values.

   :param frame: The dataset to scale. This should either be a :class:`~pandas.DataFrame` or a
       :class:`~numpy.ndarray` of shape `(N*, D)`.
   :param stats: A dictionary that contains pre-computed feature-wise mean and standard deviation values as
       :class:`~numpy.ndarray`\s of shape `(D,)` in the keys 'mean' and 'std', respectively.
   :return: The scaled :class:`~pandas.DataFrame` or :class:`~numpy.ndarray`. Note that this will usually be a
       copy as this function does not modify any values in place.
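
The chunk-wise workflow documented above (accumulate statistics over several chunks, then scale with the
resulting minimum and maximum) can be illustrated with a minimal sketch. It does not import ``timesead``:
``running_stats`` and ``minmax_scale`` are hypothetical NumPy stand-ins written only for this example, and the
library's actual implementations may differ, for instance in how constant features are handled.

.. code-block:: python

   import numpy as np


   def running_stats(chunk, mean=None, min_val=None, max_val=None, old_n=None):
       """Hypothetical stand-in: fold one (N, D) chunk into running statistics."""
       chunk = np.asarray(chunk, dtype=float)
       n = chunk.shape[0]
       chunk_mean = chunk.mean(axis=0)
       chunk_min = chunk.min(axis=0)
       chunk_max = chunk.max(axis=0)
       if old_n is None:
           # First call: the statistics are those of the chunk itself.
           return chunk_mean, chunk_min, chunk_max, n
       total = old_n + n
       # Combine the old mean and the chunk mean, weighted by their sample counts.
       new_mean = (old_n * mean + n * chunk_mean) / total
       new_min = np.minimum(min_val, chunk_min)
       new_max = np.maximum(max_val, chunk_max)
       return new_mean, new_min, new_max, total


   def minmax_scale(data, stats):
       """Hypothetical stand-in: map each feature to [0, 1] using stats['min'] and stats['max']."""
       data = np.asarray(data, dtype=float)
       span = stats["max"] - stats["min"]
       # Assumption made for this sketch: constant features are divided by 1 instead of 0.
       span = np.where(span == 0, 1.0, span)
       return (data - stats["min"]) / span


   # Two chunks with D = 2 features each.
   chunks = [np.array([[0.0, 10.0], [2.0, 30.0]]), np.array([[4.0, 20.0]])]

   state = (None, None, None, None)
   for chunk in chunks:
       state = running_stats(chunk, *state)
   mean, min_val, max_val, n = state

   scaled = minmax_scale(np.vstack(chunks), {"min": min_val, "max": max_val})

Feeding the returned tuple back in as the next call's arguments mirrors the documented signature of
:func:`update_statistics_increment`, where every statistic is `None` on the first call.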