Finding duplicate images in Python with imagededup
imagededup is a Python library that lets us find duplicate images.
It is fairly fast, but it pulls in quite a few dependencies.
We can install it with pip:
pip install imagededup
Here is an example:
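from imagededup.methods import PHash
# Directory containing the images to compare
img_dir = '/home/fermat/Scrivania/img/'
phasher = PHash()
# Compute a perceptual hash for every image in the directory
encodings = phasher.encode_images(image_dir=img_dir)
# Map each file name to the list of files considered its duplicates
duplicates = phasher.find_duplicates(encoding_map=encodings)
print(duplicates)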
This is the output:
2024-04-17 09:46:33,178: INFO Start: Calculating hashes...
100%|██████████| 3/3 [00:00<00:00, 76.03it/s]
2024-04-17 09:46:33,271: INFO End: Calculating hashes!
/home/fermat/TEST/test_python/pythonProject/.venv/lib/python3.11/site-packages/imagededup/methods/hashing.py:317: RuntimeWarning: Parameter num_enc_workers has no effect since encodings are already provided
warnings.warn('Parameter num_enc_workers has no effect since encodings are already provided', RuntimeWarning)
2024-04-17 09:46:33,271: INFO Start: Evaluating hamming distances for getting duplicates
2024-04-17 09:46:33,271: INFO Start: Retrieving duplicates using Cython Brute force algorithm
100%|██████████| 3/3 [00:00<00:00, 2570.56it/s]
2024-04-17 09:46:33,322: INFO End: Retrieving duplicates using Cython Brute force algorithm
2024-04-17 09:46:33,322: INFO End: Evaluating hamming distances for getting duplicates
{'Apple wood.jpg': ['2.jpg', '1.jpg'], '2.jpg': ['Apple wood.jpg', '1.jpg'], '1.jpg': ['Apple wood.jpg', '2.jpg']}
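The dictionary is symmetric: every file appears as a key and again in the lists of its duplicates, so the same group of three images shows up three times. If you want each group only once, a small helper can collapse it. This is just a sketch in plain Python; group_duplicates is a hypothetical name of mine, not part of imagededup, and it assumes the dictionary has the shape shown above.

```python
def group_duplicates(duplicates):
    """Collapse a {file: [duplicates]} mapping into disjoint groups.

    Hypothetical helper, not part of imagededup. Files with no
    duplicates are left out of the result.
    """
    groups = []
    seen = set()
    for name, dups in duplicates.items():
        if name in seen or not dups:
            continue
        # Follow the duplicate links starting from this file
        group, stack = set(), [name]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(duplicates.get(current, []))
        seen |= group
        groups.append(sorted(group))
    return groups


# Same dictionary as in the output above
duplicates = {
    'Apple wood.jpg': ['2.jpg', '1.jpg'],
    '2.jpg': ['Apple wood.jpg', '1.jpg'],
    '1.jpg': ['Apple wood.jpg', '2.jpg'],
}
print(group_duplicates(duplicates))
# [['1.jpg', '2.jpg', 'Apple wood.jpg']]
```

From each group you can then decide which file to keep and which ones to delete.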
Enjoy!