Stratified Sampler
Because computing AUC involves both positive and negative samples, every batch must contain samples from each class. This requirement motivates the following StratifiedSampler class:
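To illustrate why both classes must be present, here is a minimal sketch of AUC computed as the fraction of correctly ranked (positive, negative) pairs. The function name `pairwise_auc` and the toy scores are made up for illustration and are not part of XCurve. When a batch contains only one class there are no pairs to compare, so the quantity is undefined.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when the batch lacks
    one of the two classes, because there are then no pairs to compare.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# A batch with both classes yields a value; a single-class batch does not.
mixed_batch = pairwise_auc([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 1])
single_class_batch = pairwise_auc([0.9, 0.2, 0.6, 0.4], [0, 0, 0, 0])
print(mixed_batch, single_class_batch)
```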
CLASS StratifiedSampler(class_vector, batch_size, rpos=1, rneg=4)
When the number of classes equals 2, samples are drawn according to the ratio rpos : rneg. If a class does not have enough samples, the shortfall is filled by copying and resampling existing samples.
When the number of classes is greater than 2, samples are drawn according to the proportion of each class in the dataset.
When the number of classes is less than 2, an error is raised.
In addition, after sampling, the indices are randomly shuffled and the sampled portion of the data is removed from the dataset.
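The binary case can be sketched roughly as follows. This is a hypothetical illustration, not the XCurve implementation: the function names, the rounding of the per-batch counts, the number of batches, and the exact top-up strategy are all assumptions made for the example.

```python
import random


def _draw(pool, everything, k, rng):
    """Take up to k indices from pool (removing them), then top up by
    resampling with replacement from the full class if the pool runs short."""
    taken = [pool.pop() for _ in range(min(k, len(pool)))]
    taken += rng.choices(everything, k=k - len(taken))
    return taken


def stratified_binary_batches(labels, batch_size, rpos=1, rneg=4, seed=0):
    """Yield index lists whose positive:negative ratio is about rpos:rneg.

    Assumed rounding: floor(batch_size * rpos / (rpos + rneg)) positives,
    at least one, and the remainder of the batch is negatives.
    """
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    rng = random.Random(seed)
    all_pos = [i for i, y in enumerate(labels) if y == 1]
    all_neg = [i for i, y in enumerate(labels) if y == 0]
    if not all_pos or not all_neg:
        raise ValueError("both classes must be present")
    n_pos = max(1, batch_size * rpos // (rpos + rneg))
    n_neg = batch_size - n_pos
    pos, neg = list(all_pos), list(all_neg)
    rng.shuffle(pos)
    rng.shuffle(neg)
    n_batches = -(-len(labels) // batch_size)  # ceiling division
    for _ in range(n_batches):
        batch = _draw(pos, all_pos, n_pos, rng) + _draw(neg, all_neg, n_neg, rng)
        rng.shuffle(batch)  # scramble the order inside the batch
        yield batch


# Toy data: 10 positives and 90 negatives.
labels = [1] * 10 + [0] * 90
batches = list(stratified_binary_batches(labels, batch_size=32, rpos=1, rneg=10))
positives_per_batch = [sum(labels[i] for i in b) for b in batches]
print(len(batches), positives_per_batch)
```

Under these assumptions, a batch of 32 with rpos=1 and rneg=10 holds 2 positives and 30 negatives, and the last batch is filled by resampling negatives because the pool of 90 is already used up.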
| Parameter | Description |
|---|---|
| class_vector | The input vector. |
| batch_size | Number of samples per batch. |
| rpos, rneg | Used only in binary classification problems to control the sampling ratio of positive to negative class samples. |
Example:
import torch
from easydict import EasyDict as edict
from XCurve.AUROC.dataloaders import get_datasets
from XCurve.AUROC.dataloaders import get_data_loaders
# set dataset params, see our doc. for more details.
dataset_args = edict({
    "data_dir": "cifar-10-long-tail/",  # relative path of dataset
    "input_size": [32, 32],
    "norm_params": {
        "mean": [123.675, 116.280, 103.530],
        "std": [58.395, 57.120, 57.375]
    },
    "use_lmdb": True,
    "resampler_type": "None",
    "sampler": {  # only used for binary classification
        "rpos": 1,
        "rneg": 10
    },
    "npy_style": True,
    "aug": True,
    "class2id": {  # positive (minority) class idx
        "1": 1, "0": 0, "2": 0, "3": 0, "4": 0, "5": 0,
        "6": 0, "7": 0, "8": 0, "9": 0
    }
})
train_set, val_set, test_set = get_datasets(dataset_args) # load dataset
trainloader, valloader, testloader = get_data_loaders(
    train_set,
    val_set,
    test_set,
    train_batch_size=32,
    test_batch_size=64
)  # load dataloader
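As a rough illustration of what the class2id setting above expresses, the sketch below maps the ten original class ids to binary labels, with original class 1 as the positive class. How XCurve applies this mapping internally is not shown here; the helper `to_binary` and the label list are made up for the example.

```python
from collections import Counter

# Same mapping as in the example config: keys are original class ids
# (as strings), values are the binary label (1 = positive class).
class2id = {
    "1": 1, "0": 0, "2": 0, "3": 0, "4": 0, "5": 0,
    "6": 0, "7": 0, "8": 0, "9": 0,
}


def to_binary(original_label, mapping=class2id):
    """Hypothetical helper: look up the binary label of an original class id."""
    return mapping[str(original_label)]


# Made-up original labels, only to show the effect of the mapping.
original_labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 1]
binary_labels = [to_binary(y) for y in original_labels]
counts = Counter(binary_labels)
print(binary_labels, dict(counts))
```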