WikiDL
WikiDL.__init__
Initializes a WikiDL instance for downloading a specific snapshot from the Wikimedia data dumps.
| Name | Type | Description |
|---|---|---|
| snapshot_date | str | The date of the snapshot to download. This field is required. |
| master_url | str | The master URL pointing to the data dump. Defaults to https://dumps.wikimedia.org/enwiki. |
| select_pattern | str | File selector pattern. Defaults to |
| custom_select_pattern | str \| None | Custom select pattern for target file names. Defaults to |
| num_proc | int | Number of processes to use. |
| log_level | int | Logging level to use. Defaults to |
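
The constructor parameters above can be assembled and checked before any network work begins. The following is a minimal sketch, not the library's own code: the helper names `build_wikidl_kwargs` and `make_downloader` are hypothetical, the `from wikidl import WikiDL` import path is an assumption, and so is the `YYYYMMDD` date format (chosen because Wikimedia dump directories are named that way). The `num_proc` and `log_level` values are example values, not the documented defaults.

```python
import logging

# Default stated in the parameter table above.
DEFAULT_MASTER_URL = "https://dumps.wikimedia.org/enwiki"


def build_wikidl_kwargs(snapshot_date, num_proc=1, log_level=logging.INFO,
                        master_url=DEFAULT_MASTER_URL):
    """Validate and collect keyword arguments for WikiDL.__init__ (hypothetical helper)."""
    # Assumption: snapshot dates are written as YYYYMMDD, matching the
    # directory names used on dumps.wikimedia.org.
    if len(snapshot_date) != 8 or not snapshot_date.isdigit():
        raise ValueError(f"expected a YYYYMMDD date, got {snapshot_date!r}")
    if num_proc < 1:
        raise ValueError("num_proc must be at least 1")
    return {
        "snapshot_date": snapshot_date,
        "master_url": master_url,
        "num_proc": num_proc,    # example value, not the documented default
        "log_level": log_level,  # example value, not the documented default
    }


def make_downloader(**kwargs):
    """Create a WikiDL instance; the import path below is an assumption."""
    from wikidl import WikiDL  # imported lazily so the helper above has no dependency
    return WikiDL(**kwargs)
```

Under these assumptions, a call would look like `make_downloader(**build_wikidl_kwargs("20240101", num_proc=2))`. The select-pattern parameters are left out so that the library's own defaults apply.
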
WikiDL.start
Starts the downloading task specified in the WikiDL instance.
| Name | Type | Description |
|---|---|---|
| output_dir | str | Output directory for downloaded files. This field is required. |
| limit | int \| None | Maximum number of files to download. Useful for debugging. Defaults to downloading all matching files. |
This method does not return a value. Downloaded files are saved to the specified output_dir.
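
Because start returns nothing, a caller who wants to see what was downloaded has to inspect output_dir afterwards. The sketch below wraps that pattern in a hypothetical helper, `run_download`. It accepts any object with a `start(output_dir, limit)` method as described above, creates the directory up front (whether WikiDL does this itself is not stated here, so this is a defensive assumption), and returns the file names found once the call completes.

```python
from pathlib import Path


def run_download(downloader, output_dir, limit=None):
    """Run downloader.start and list the files now present in output_dir.

    `downloader` is expected to be a WikiDL instance, or any object that
    exposes the same start(output_dir, limit) method.
    """
    out = Path(output_dir)
    # Defensive assumption: create the directory in case start() does not.
    out.mkdir(parents=True, exist_ok=True)

    # limit=None means no cap, i.e. all matching files are downloaded.
    downloader.start(output_dir=str(out), limit=limit)

    # start() returns no value, so report results by listing the directory.
    return sorted(p.name for p in out.iterdir())
```

Passing a small limit, such as `run_download(downloader, "./dumps", limit=1)`, keeps a debugging run short before committing to a full snapshot.
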