
How to extract data and download files from a URL more efficiently

💡 Do you need to download files from multiple URLs where each URL is a unique download link? We've got you covered.

Problem Description


You need to extract a list of URLs and then perform certain actions on each of them. For example, you need to navigate to the following web page:

https://www.i-pex.com/library/white-paper


This page contains a list of links to white papers. You need to navigate to each link to download the PDF file, which can be tedious when using UI elements.



Solution


The following flow provides a high-level overview of the solution:


Firstly, we will need to extract the list of white papers as a datatable using the action “Extract data from web page”. Remember to specify the store data mode as “Variable”.



While the “Extract data from web page” action is open, navigate to ‘https://www.i-pex.com/library/white-paper’ in the browser manually. Right click the first title > Select Extract Element Value > Select Href


Right click the second title > Select Extract Element Value > Select Href


The Live web helper pop-up will then automatically extract the remaining items in the list. You may update the column name if required > Click Finish
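
For readers who want to see the same idea outside the recorder, the extraction step amounts to collecting the Href of every link in the list into a table of URLs. The following is a minimal Python sketch using only the standard library. The sample HTML, the `LinkCollector` class and the `extract_links` helper are illustrative assumptions; they are not part of the flow and do not reflect the real page's markup.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Turn relative links into absolute URLs.
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return all link targets found in the given HTML text."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical markup standing in for the white paper list.
SAMPLE_HTML = """
<ul>
  <li><a href="/library/white-paper/first">First paper</a></li>
  <li><a href="/library/white-paper/second">Second paper</a></li>
</ul>
"""

links = extract_links(SAMPLE_HTML, "https://www.i-pex.com/library/white-paper")
```

Here `links` plays the role of the datatable column produced by the Live web helper.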


Use an "If" action to check if the download directory exists. If it does not, create the directory.
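
As a rough illustration of this check, the same logic could be written in Python as shown below. The `ensure_directory` helper is a hypothetical name chosen for this sketch, not an action in the flow.

```python
from pathlib import Path


def ensure_directory(path):
    """Create the directory (and any missing parents) if it does not exist."""
    directory = Path(path)
    if not directory.exists():
        # exist_ok guards against the folder appearing between check and create.
        directory.mkdir(parents=True, exist_ok=True)
    return directory
```

Calling the helper a second time on the same path does nothing, which matches the behaviour of the "If" check in the flow.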


Then, using the action “For each”, we will loop through each data row (“CurrentItem”) in the datatable. For each iteration of the loop:

  1. We use the action “Go to web page” to navigate to the URL “%CurrentItem[‘URL’]%”.

  2. We use the action “Extract data from web page” to extract the file URL. Remember to specify the store data mode as “Variable”.



While the “Extract data from web page” action is open, navigate to one of the items manually. Right click the title > Select Extract element value > Select Href > Click Finish


  3. We use the action “Download from web” to download the file:

> Specify the “URL” as “%Link%”

> Configure the “Save response” and “File name”

> Specify the “Destination folder”


Close the web browser after the “End” action of the loop.
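
The download part of the loop can also be sketched in Python, assuming the file URLs have already been extracted. The helper names (`file_name_from_url`, `fetch_bytes`, `download_all`) and the fallback file name `download.pdf` are assumptions made for this example. The sketch covers only the “Download from web” step, not the browser navigation.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def file_name_from_url(url, default="download.pdf"):
    """Use the last path segment of the URL as the file name."""
    name = PurePosixPath(urlparse(url).path).name
    return name or default


def fetch_bytes(url):
    """Download the body of a URL (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_all(file_urls, destination, fetch=fetch_bytes):
    """Save each URL into the destination folder and return the saved paths."""
    folder = Path(destination)
    folder.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in file_urls:
        target = folder / file_name_from_url(url)
        target.write_bytes(fetch(url))
        saved.append(target)
    return saved
```

The `fetch` parameter is there so the loop can be tried with a stand-in function before pointing it at real URLs.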







Additional Information


  • Last updated on: 9 Dec 2024

  • Tested version(s): 2.50.00183.24303

  • Prerequisites: Browser (e.g. Chrome)

  • Dependencies: None

  • Known issues: None

