Open Data

A core principle of DMLab’s research is open science. The datasets below are publicly available and have been used as community benchmarks for space weather forecasting, solar image analysis, and machine learning research. All datasets are free to use with appropriate citation.


Space Weather & Solar Physics Datasets

SWAN-SF: Space Weather ANalytics for Solar Flares

A comprehensive multivariate time series (MVTS) benchmark dataset extracted from NASA SDO/HMI solar photospheric vector magnetograms, covering 4,098 active regions from May 2010 to December 2018 with 51 flare-predictive parameters. Widely used as a community testbed for solar flare prediction models.


PIL Dataset: Magnetic Polarity Inversion Lines from SDO/HMI

A large-scale publicly available dataset of magnetic polarity inversion lines (PILs) covering solar cycle 24 (May 2010–March 2019), including PIL raster masks, region of polarity inversion (RoPI), and multivariate time-series metadata extracted from 4,090 HARP series.


Multimodal PIL Dataset: SDO/HMI Magnetic Polarity Inversions for Flare Forecasting

A supervised-format extension of the PIL dataset that integrates rasters, convex hulls, and multivariate time series with flare class labels (FQ, C, M, X) for May 2010–January 2019, designed for direct use in machine learning pipelines.


Large-Scale SDO Image Dataset for Computer Vision

A standardized and curated large-scale dataset of solar events from NASA SDO/AIA high-resolution images, compiled to accelerate computer vision research on solar image data by reducing data acquisition and curation overhead.


Curated Image Parameter Dataset from SDO/AIA

A large image parameter dataset extracted from the NASA SDO/AIA instrument, covering January 2011 onward at 6-minute cadence across nine wavelength channels (~1 TiB/year). Includes a public API for programmatic access.


Large-Scale Solar Event Reports from SDO Feature Recognition Modules

A comprehensive dataset of over 280,000 event reports for seven types of solar phenomena collected from automated SDO Feature Finding Team modules and reported to the Heliophysics Event Knowledgebase (HEK).


Using These Datasets

All datasets are freely available for research and educational use. If you use any of these datasets in your work, please cite the corresponding paper. For questions about data access, extensions, or collaboration, please contact us.