Use Case

How to scrape a job board to a Knack database

CHECK OUT THE BYTELINE FLOW

Here is the Byteline flow used for this use case. You can clone it by following the link: How to Scrape Job Boards to a Knack Database.
Byteline flow runs on a scheduler
The Byteline flow runs in the cloud at the scheduled time, which you can configure to suit your requirements.

Step by Step Instructions


Byteline lets you scrape web pages without writing any code. You can then export the scraped data directly to any Byteline integration. This guide explains how to scrape a job board into a Knack database.

We will be configuring the following three nodes to create the Job Board:

Scheduler Trigger Node - First, we’ll configure the Scheduler node to run the flow at regular intervals.


Web Scraper Node - Next, we’ll configure the Web Scraper node to extract data from a web page. Here, we will scrape data from the ZipRecruiter site.


Knack - Update Records Node - Lastly, we’ll configure the Knack node to store the scraped data, creating a Job Board in your Knack account.


Let’s get started.


Create Flow

In this section, you’ll learn to create the flow. For more details, you can check How to Create your First Flow Design.

 

Step 1: Enter the flow name to create your new flow.




Step 2: Select the Scheduler trigger node to initiate your flow. 


Now, you need to configure the Scheduler node to run the flow at regular intervals.


So, let’s get started with Scheduler node configuration! 

Configure Scheduler

Step 1: Click on the edit button to configure the scheduler node. 



We will keep the default values for the Scheduler.





The Scheduler configuration is now complete. Now we need to configure the Web Scraper. 

For more information, read how to configure Web Scraper

Configure Web Scraper

Step 1: Click on the add button to view the nodes in the select node window. 




Step 2: Select the Web Scraper node to add it to your flow. 




Step 3: Click on the Edit button to configure the Web Scraper node. 




Step 4: Launch ZipRecruiter in a new browser tab and enable the Byteline Web Scraper Chrome extension.




Here, we scrape four fields from the ZipRecruiter website: title, company name, job link, and location.


For title


Step 1: Double-click the title to select the job title you would like to scrape.




Step 2: Select the Text option to specify the data type for scraping. 



Step 3: Click on Repeating Elements to scrape every job title on the web page. We use repeating elements because multiple jobs are scraped from this page.


 



The Web Scraper will automatically copy the data to the clipboard. 


Step 4: In the Web Scraper configuration window, click on Paste from the Chrome Extension to paste the scraped data.




Step 5: Enter the Array Name to specify the JSON array from which you want to fetch elements.

 

Step 6: Enter title as the field name. The XPath for scraping the job title is filled in automatically.
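
To make the idea of repeating elements and per-field XPaths more concrete, here is a minimal Python sketch that uses only the standard library. The markup, the scrape_repeating helper, and the XPath expressions are hypothetical illustrations. They are not Byteline's implementation, and they do not reflect ZipRecruiter's actual page structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for a job listing page.
PAGE = """
<html><body>
  <article class="job">
    <h2><a href="/jobs/1">Data Engineer</a></h2>
    <p class="company">Acme</p>
  </article>
  <article class="job">
    <h2><a href="/jobs/2">QA Analyst</a></h2>
    <p class="company">Globex</p>
  </article>
</body></html>
"""


def scrape_repeating(page, item_xpath, field_specs):
    """Return one dict per repeating element matched by item_xpath.

    field_specs maps a field name to (xpath, kind), where kind is
    "text" (the element's text) or "link" (its href attribute).
    """
    root = ET.fromstring(page.strip())
    rows = []
    for item in root.findall(item_xpath):
        row = {}
        for name, (xpath, kind) in field_specs.items():
            node = item.find(xpath)  # XPath is evaluated relative to the item
            if node is None:
                row[name] = None
            elif kind == "link":
                row[name] = node.get("href")
            else:
                row[name] = (node.text or "").strip()
        rows.append(row)
    return rows


jobs = scrape_repeating(
    PAGE,
    ".//article[@class='job']",  # the repeating element (assumed markup)
    {
        "title": ("./h2/a", "text"),
        "company": ("./p[@class='company']", "text"),
        "link": ("./h2/a", "link"),
    },
)
for job in jobs:
    print(job)
```

Each repeating element yields one entry in the resulting array, which is why the same field XPath can return many job titles from a single page.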


For Company Name

Step 1: Double-click the company name to select it for scraping.




Step 2: Select the Text option to specify the data type for scraping. 




Step 3: Click on Repeating Elements to scrape every company name on the web page.



The Web Scraper will automatically copy the data to the clipboard.


Step 4: In the Web Scraper configuration window, click on Paste from the Chrome Extension to paste the scraped data.



Step 5: Enter Company as the field name. The XPath for scraping the company name is filled in automatically.


For Link


Step 1: Double-click the job title (which is hyperlinked) to select the link for scraping.




Step 2: Select the Link option to specify the data type.


Note: You can also preview the link. 




Step 3: Click on Repeating Elements to scrape every job link on the web page.




Note: The Web Scraper will automatically copy the data to the clipboard.


Step 4: In the Web Scraper configuration window, click on Paste from the Chrome Extension to paste the scraped data.


Step 5: Enter Link as the field name. The XPath for scraping the link is filled in automatically.

For Location


Step 1: Double-click the location to select the job location for scraping.





Step 2: Select the Text option to specify the data type for scraping. 




Step 3: Click on Repeating Elements to scrape every job location on the web page.




Note: The Web Scraper will automatically copy the data to the clipboard. 


Step 4: In the Web Scraper configuration window, click on Paste from the Chrome Extension to paste the scraped data.



Step 5: Enter Location as the field name. The XPath for scraping the location is filled in automatically.



Step 6: Enter the URL of the web page you want to scrape.




Step 7: Once you’re done with the configuration, click on the Save button. 



You have now configured the Web Scraper to scrape the required job details.

After configuring the flow, perform a test run to make sure the Web Scraper task works.

Run 

Click on the Test Run button to test the flow. 



Now, click on the 'i' (more information) button in the top-right corner of the Web Scraper node to check the extracted data.


You will see a SUCCESS window as shown in the snippet below: 




Your Web Scraper node has been configured successfully.  
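
When inspecting the extracted data, it can help to confirm that every scraped job carries all four fields. The Python sketch below assumes the node output is a JSON object holding the jobs under the configured array name. The sample payload, the array name jobs, and the field names are hypothetical and may not match Byteline's actual output format.

```python
import json

# Field names assumed from this guide's four scraped fields.
REQUIRED_FIELDS = ("title", "company", "link", "location")


def find_incomplete_jobs(output_json, array_name):
    """Return (index, missing_fields) for every job with an empty field."""
    data = json.loads(output_json)
    problems = []
    for index, job in enumerate(data.get(array_name, [])):
        missing = [field for field in REQUIRED_FIELDS if not job.get(field)]
        if missing:
            problems.append((index, missing))
    return problems


# Hypothetical sample output; real scraped values will differ.
SAMPLE_OUTPUT = json.dumps({
    "jobs": [
        {"title": "Data Engineer", "company": "Acme",
         "link": "https://example.com/jobs/1", "location": "Austin, TX"},
        {"title": "QA Analyst", "company": "",
         "link": "https://example.com/jobs/2", "location": None},
    ]
})

problems = find_incomplete_jobs(SAMPLE_OUTPUT, "jobs")
print(problems)
```

An empty result means every job has a value for each field. A non-empty result usually points at an XPath that did not match on some listings.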

Knack - Update Records

Connect your Knack database account to the Knack - Update Records Byteline node to store the scraped job details and create the job board. For more details, see our documentation on how to configure Knack - Update Records.

Follow the steps below to add the Knack - Update Records task.


Step 1: Click on the add button to view the nodes in the select node window. 



Step 2: Select the Knack - Update Records node to add it to your flow.



Configure

In this section, you’ll learn how to configure the Knack - Update Records node. 


Step 1: Click on the Edit button to configure the Knack - Update Records node. 


 

Step 2: Click on the Configure it now button to access the Knack database through Byteline. 

Then go to your Knack account and follow the instructions below to connect the Knack database with Byteline.


Step 3: Log in to your Knack account and navigate to the Settings tab from the left sidebar menu. 

Step 4: Select the API & Code option from the Settings menu and copy the Application ID and API Key shown there.

 

Step 5: Go to the Byteline connection window and enter the Knack Application ID and API Key.

Once done, click on the Save button.

You have now successfully established a connection between your Knack account and Byteline.


Step 6: Select the Object you want to update from the dropdown menu. 


Step 7: Next, tick the Loop over checkbox and select the Web Scraper node whose scraped jobs you want to use. Loop over makes this task run once for each job scraped by the Web Scraper node.


Step 8: Click on the Selector button next to the Title field to map the title value from the array.



Step 9: In the Output window, click on the title to pick its path.




The variable path is filled into the Title field automatically.


Repeat this step to map each remaining value to its respective field.


Step 10: Click on the Save button to save the node configuration.




Once you save the configuration, the indicator in the top-right corner of the Knack - Update Records node will turn green.


Your Knack - Update Records node has been configured successfully.
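
To illustrate what looping over the scraped jobs and mapping fields amounts to, the Python sketch below builds one create-record request per job. It is a sketch under stated assumptions, not a description of how Byteline talks to Knack internally. The header names and URL shape follow Knack's public REST API as we understand it, while the credentials, the object key object_1, the field keys field_1 to field_4, and the build_record_requests helper are placeholders. The requests are only built, never sent.

```python
import json

KNACK_API_ROOT = "https://api.knack.com/v1"


def build_record_requests(jobs, app_id, api_key, object_key, field_map):
    """Build one create-record request per scraped job (nothing is sent)."""
    url = f"{KNACK_API_ROOT}/objects/{object_key}/records"
    headers = {
        "X-Knack-Application-Id": app_id,
        "X-Knack-REST-API-Key": api_key,
        "Content-Type": "application/json",
    }
    requests = []
    for job in jobs:  # "loop over": the task runs once per scraped job
        # Field mapping: scraped field name -> Knack field key.
        body = {field_key: job.get(name) for name, field_key in field_map.items()}
        requests.append({
            "method": "POST",
            "url": url,
            "headers": headers,
            "body": json.dumps(body),
        })
    return requests


# Hypothetical scraped jobs and placeholder Knack identifiers.
scraped_jobs = [
    {"title": "Data Engineer", "company": "Acme",
     "link": "https://example.com/jobs/1", "location": "Austin, TX"},
    {"title": "QA Analyst", "company": "Globex",
     "link": "https://example.com/jobs/2", "location": "Remote"},
]
field_map = {"title": "field_1", "company": "field_2",
             "link": "field_3", "location": "field_4"}

requests = build_record_requests(
    scraped_jobs, "YOUR_APP_ID", "YOUR_API_KEY", "object_1", field_map
)
for request in requests:
    print(request["method"], request["url"], request["body"])
```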


Now that the flow is configured, let's test it.

Test Run

Run the flow by clicking on the Test Run button in the top-right corner of the interface.



You will see a SUCCESS window as shown in the snippet below: 




Handling Knack Record Updates

Byteline's update tasks also handle updates and deletes, using the primary keys configured in the task field mapper. Based on the configured primary keys, Byteline finds the matching records to update or delete.

Let's test it out. We will delete two existing records so that the next run treats them as newly scraped records. First, go to the Advanced tab of the Knack - Update Records node and use the update radio button to select "Overwrite".


Step 1: Navigate to your Knack account and click on the JobBoard from the left sidebar to view the scraped job postings with title, location, link, and company name. 


Step 2: Select any two entries from the list (the first and second in this case) and click on the delete button to remove them.



Step 3: Run the flow once again to see the updated results, as shown in the snippet below. As expected, Byteline inserts the two records that we manually deleted from the Knack database and updates the rest.
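
The insert-or-update behavior seen in this test can be sketched in a few lines of Python. This is an illustration of primary-key matching in general, not Byteline's actual logic. The upsert helper, the choice of link as the primary key, and the sample records are assumptions, and deletes are left out for brevity.

```python
def upsert(existing, scraped, primary_keys):
    """Insert scraped records that have no match; overwrite those that do.

    Records match when all primary key fields are equal.
    Mutates `existing` in place and returns insert/update counts.
    """
    def key_of(record):
        return tuple(record[k] for k in primary_keys)

    by_key = {key_of(record): record for record in existing}
    counts = {"inserted": 0, "updated": 0}
    for record in scraped:
        match = by_key.get(key_of(record))
        if match is None:
            new_record = dict(record)
            existing.append(new_record)
            by_key[key_of(record)] = new_record
            counts["inserted"] += 1
        else:
            match.update(record)  # overwrite the stored values
            counts["updated"] += 1
    return counts


# Four hypothetical scraped jobs; the table has lost the first two,
# mirroring the manual deletion in Step 2.
scraped = [
    {"link": f"https://example.com/jobs/{n}", "title": f"Job {n}"}
    for n in range(1, 5)
]
table = [dict(record) for record in scraped[2:]]

result = upsert(table, scraped, primary_keys=["link"])
print(result)
```

With these sample records, the two deleted jobs are inserted again and the two remaining jobs are updated, which matches the behavior described in Step 3.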



You have successfully scraped the ZipRecruiter job board into a Knack database using Knack - Update Records.

If you have any questions, feel free to connect with us.